Source author record

Bin Dong

Bin Dong appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

33 works

23 topics

4 close collaborators

preprint · 2026 · arXiv

Hilbert-Geo: Solving Solid Geometric Problems by Neural-Symbolic Reasoning

Geometric problem solving, a typical multimodal reasoning problem, has attracted much attention and made great progress recently. However, most works focus on plane geometry and usually fail on solid geometry because of 3D spatial diagrams and complex reasoning. To bridge this gap, we introduce Hilbert-Geo, the first unified formal language framework for solid geometry, including an extensive predicate library and a dedicated theorem bank. Based on this framework, we propose a Parse2Reason method with two steps: parsing, then reasoning. In the parsing step, we utilize conditional description language (CDL), a formalized language composed of predicates specifically designed to construct geometric conditions, to represent both the problem description (natural text) and solid diagrams (visual image). In the reasoning step, we leverage the formal CDL and the theorem bank to perform relational inference and algebraic computation, generating strictly correct, verifiable, and human-readable reasoning processes. Notably, Hilbert-Geo is also applicable to plane geometry. To advance geometric reasoning, we curate two expert-annotated datasets, SolidFGeo2k and PlaneFGeo3k, which are furnished with geometric formal language annotations, solutions and answers. Extensive experiments show that our proposed method achieves state-of-the-art (SOTA) performance of 77.3% on SolidFGeo2k and 84.1% on MathVerse-Solid (a small subset of MathVerse dedicated to solid geometry), substantially outperforming leading MLLMs such as Gemini-2.5-pro (54.2% on SolidFGeo2k) and GPT-5 (62.9% on MathVerse-Solid). In addition, our method achieves SOTA accuracy of 80.2% on PlaneFGeo3k, demonstrating the generality of Hilbert-Geo in geometric reasoning. Our code and datasets will be publicly available.

preprint · 2024 · arXiv

Analysis of a wavelet frame based two-scale model for enhanced edges

Image restoration is a class of important tasks that emerges from a wide range of scientific disciplines. It has been noticed that most practical images can be modeled as a composition from a sparse singularity set (edges) where the image contents or their gradients change drastically, and cartoon chunks in which a high degree of regularity is dominant. Enhancing edges while promoting regularity elsewhere has been an important criterion for successful restoration in many image classes. In this article, we present a wavelet frame based image restoration model that captures potential edges and facilitates the restoration procedure by a dedicated treatment both of singularity and of cartoon. Moreover, its geometric robustness is enhanced by exploiting subtle inter-scale information available in the coarse image. To substantiate our intuition, we prove that this model converges to one variant of the celebrated Mumford-Shah model when adequate asymptotic specifications are given.

preprint · 2022 · arXiv

A Note on Machine Learning Approach for Computational Imaging

Computational imaging has been playing a vital role in the development of natural sciences. Advances in sensory, information, and computer technologies have further extended the scope of influence of imaging, making digital images an essential component of our daily lives. For the past three decades, we have witnessed phenomenal developments of mathematical and machine learning methods in computational imaging. In this note, we will review some of the recent developments of the machine learning approach for computational imaging and discuss its differences and relations to the mathematical approach. We will demonstrate how we may combine the wisdom from both approaches, discuss the merits and potentials of such a combination and present some of the new computational and theoretical challenges it brings about.

preprint · 2022 · arXiv

A scalable deep learning approach for solving high-dimensional dynamic optimal transport

The dynamic formulation of optimal transport has attracted growing interest in scientific computing and machine learning, and its computation requires solving a PDE-constrained optimization problem. The classical approaches based on Eulerian discretization suffer from the curse of dimensionality, which arises from the approximation of the high-dimensional velocity field. In this work, we propose a deep learning based method to solve dynamic optimal transport in high-dimensional space. Our method contains three main ingredients: a carefully designed representation of the velocity field, the discretization of the PDE constraint along the characteristics, and the computation of high-dimensional integrals by the Monte Carlo method in each time step. Specifically, in the representation of the velocity field, we apply the classical nodal basis functions in time and deep neural networks in the space domain with H1-norm regularization. This technique promotes the regularity of the velocity field in both time and space, so that the discretization along the characteristics remains stable during the training process. Extensive numerical examples have been conducted to test the proposed method. Compared to other optimal transport solvers, our method gives more accurate results in high-dimensional cases and scales very well with respect to dimension. Finally, we extend our method to more complicated cases such as the crowd motion problem.
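Two of the ingredients named in this abstract, discretization along the characteristics and Monte Carlo integration, can be illustrated with a toy sketch. It assumes the usual kinetic-energy objective of dynamic optimal transport, the expectation of the time integral of |v(X_t, t)|^2 along particle paths dX/dt = v(X, t) with X_0 drawn from the initial density. The abstract does not state the objective, and the function `kinetic_energy_mc`, the constant velocity field and the Gaussian initial density are hypothetical choices for illustration, not the paper's neural-network representation.

```python
import random

def kinetic_energy_mc(velocity, sample_rho0, n_samples=200, n_steps=20, seed=0):
    """Monte Carlo estimate of E[ int_0^1 |v(X_t, t)|^2 dt ].

    Each particle starts at X_0 ~ rho_0 and is pushed along the
    characteristic dX/dt = v(X, t) with forward Euler steps.
    """
    rng = random.Random(seed)
    dt = 1.0 / n_steps
    total = 0.0
    for _ in range(n_samples):
        x = sample_rho0(rng)
        for step in range(n_steps):
            v = velocity(x, step * dt)
            total += sum(vi * vi for vi in v) * dt      # accumulate |v|^2 dt
            x = [xi + dt * vi for xi, vi in zip(x, v)]  # move along the characteristic
    return total / n_samples

dim = 10
shift = [0.5] * dim  # hypothetical target: translate a standard Gaussian by this vector

def constant_velocity(x, t):
    # For a pure translation, moving every particle at constant speed is optimal.
    return shift

def sample_gaussian(rng):
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

cost = kinetic_energy_mc(constant_velocity, sample_gaussian)
print(cost)  # close to |shift|^2 = 10 * 0.25
```

The per-particle cost does not depend on a spatial grid, which is why sampling-based estimates of this kind remain usable as the dimension grows.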

preprint · 2022 · arXiv

Learning invariance preserving moment closure model for Boltzmann-BGK equation

As one of the main governing equations in kinetic theory, the Boltzmann equation is widely utilized in aerospace, microscopic flow, etc. Its high-resolution simulation is crucial in these related areas. However, due to the high dimensionality of the Boltzmann equation, high-resolution simulations are often difficult to achieve numerically. The moment method which was first proposed by Grad is among the popular numerical methods to achieve efficient high-resolution simulations. We can derive the governing equations in the moment method by taking moments on both sides of the Boltzmann equation, which effectively reduces the dimensionality of the problem. However, one of the main challenges is that it leads to an unclosed moment system, and closure is needed to obtain a closed moment system. It is truly an art in designing closures for moment systems and has been a significant research field in kinetic theory. Other than the traditional human designs of closures, the machine learning-based approach has attracted much attention lately in Han et al. and Huang et al. In this work, we propose a machine learning-based method to derive a moment closure model for the Boltzmann-BGK equation. In particular, the closure relation is approximated by a carefully designed deep neural network that possesses desirable physical invariances, i.e., the Galilean invariance, reflecting invariance, and scaling invariance, inherited from the original Boltzmann-BGK equation and playing an important role in the correct simulation of the Boltzmann equation. Numerical simulations on the 1D-1D examples including the smooth and discontinuous initial condition problems, Sod shock tube problem, the shock structure problems, and the 1D-3D examples including the smooth and discontinuous problems demonstrate satisfactory numerical performances of the proposed invariance preserving neural closure method.

preprint · 2022 · arXiv

MOTR: End-to-End Multiple-Object Tracking with Transformer

Temporal modeling of objects is a key challenge in multiple object tracking (MOT). Existing methods track by associating detections through motion-based and appearance-based similarity heuristics. The post-processing nature of association prevents end-to-end exploitation of temporal variations in video sequence. In this paper, we propose MOTR, which extends DETR and introduces track query to model the tracked instances in the entire video. Track query is transferred and updated frame-by-frame to perform iterative prediction over time. We propose tracklet-aware label assignment to train track queries and newborn object queries. We further propose temporal aggregation network and collective average loss to enhance temporal relation modeling. Experimental results on DanceTrack show that MOTR significantly outperforms state-of-the-art method, ByteTrack by 6.5% on HOTA metric. On MOT17, MOTR outperforms our concurrent works, TrackFormer and TransTrack, on association performance. MOTR can serve as a stronger baseline for future research on temporal modeling and Transformer-based trackers. Code is available at https://github.com/megvii-research/MOTR.

preprint · 2022 · arXiv

Region-Aware Metric Learning for Open World Semantic Segmentation via Meta-Channel Aggregation

As one of the most challenging and practical segmentation tasks, open-world semantic segmentation requires the model to segment the anomaly regions in the images and incrementally learn to segment out-of-distribution (OOD) objects, especially under a few-shot condition. The current state-of-the-art (SOTA) method, Deep Metric Learning Network (DMLNet), relies on pixel-level metric learning, with which the identification of similar regions having different semantics is difficult. Therefore, we propose a method called region-aware metric learning (RAML), which first separates the regions of the images and generates region-aware features for further metric learning. RAML improves the integrity of the segmented anomaly regions. Moreover, we propose a novel meta-channel aggregation (MCA) module to further separate anomaly regions, forming high-quality sub-region candidates and thereby improving the model performance for OOD objects. To evaluate the proposed RAML, we have conducted extensive experiments and ablation studies on Lost And Found and Road Anomaly datasets for anomaly segmentation and the CityScapes dataset for incremental few-shot learning. The results show that the proposed RAML achieves SOTA performance in both stages of open world segmentation. Our code and appendix are available at https://github.com/czifan/RAML.

preprint · 2022 · arXiv

Trained Model in Supervised Deep Learning is a Conditional Risk Minimizer

We proved that a trained model in supervised deep learning minimizes the conditional risk for each input (Theorem 2.1). This property provided insights into the behavior of trained models and established a connection between supervised and unsupervised learning in some cases. In addition, when the labels are intractable but can be written as a conditional risk minimizer, we proved an equivalent form of the original supervised learning problem with accessible labels (Theorem 2.2). We demonstrated that many existing works, such as Noise2Score, Noise2Noise and score function estimation, can be explained by our theorem. Moreover, we derived a property of the classification problem with noisy labels using Theorem 2.1 and validated it using the MNIST dataset. Furthermore, we proposed a method to estimate uncertainty in image super-resolution based on Theorem 2.2 and validated it using the ImageNet dataset. Our code is available on GitHub.
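A small numerical illustration of the conditional-risk idea, assuming the squared loss (the abstract does not fix a loss). For one fixed input, the constant prediction that minimizes the average squared error over noisy labels is their mean, so with zero-mean label noise it lands near the clean value. The clean value, noise level and function name `empirical_risk` are hypothetical.

```python
import random
from statistics import mean

def empirical_risk(c, labels):
    """Mean squared error of predicting the constant c for one fixed input."""
    return mean((y - c) ** 2 for y in labels)

rng = random.Random(0)
clean_value = 2.0  # hypothetical clean label attached to a single fixed input
noisy_labels = [clean_value + rng.gauss(0.0, 1.0) for _ in range(5000)]

best = mean(noisy_labels)  # closed-form minimizer of the squared loss
for c in (best - 0.5, best, best + 0.5):
    print(round(c, 3), round(empirical_risk(c, noisy_labels), 3))
```

This is the same mechanism usually invoked to explain why training against noisy targets, as in Noise2Noise, can still recover a clean estimate.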

preprint · 2021 · arXiv

A Practical Layer-Parallel Training Algorithm for Residual Networks

Gradient-based algorithms for training ResNets typically require a forward pass of the input data, followed by back-propagating the objective gradient to update parameters, which are time-consuming for deep ResNets. To break the dependencies between modules in both the forward and backward modes, auxiliary-variable methods such as the penalty and augmented Lagrangian (AL) approaches have attracted much interest lately due to their ability to exploit layer-wise parallelism. However, we observe that large communication overhead and lacking data augmentation are two key challenges of these methods, which may lead to low speedup ratio and accuracy drop across multiple compute devices. Inspired by the optimal control formulation of ResNets, we propose a novel serial-parallel hybrid training strategy to enable the use of data augmentation, together with downsampling filters to reduce the communication cost. The proposed strategy first trains the network parameters by solving a succession of independent sub-problems in parallel and then corrects the network parameters through a full serial forward-backward propagation of data. Such a strategy can be applied to most of the existing layer-parallel training methods using auxiliary variables. As an example, we validate the proposed strategy using penalty and AL methods on ResNet and WideResNet across MNIST, CIFAR-10 and CIFAR-100 datasets, achieving significant speedup over the traditional layer-serial training methods while maintaining comparable accuracy.

preprint · 2021 · arXiv

Enhancing Certified Robustness via Smoothed Weighted Ensembling

Randomized smoothing has achieved state-of-the-art certified robustness against $l_2$-norm adversarial attacks. However, it is not wholly resolved on how to find the optimal base classifier for randomized smoothing. In this work, we employ a Smoothed WEighted ENsembling (SWEEN) scheme to improve the performance of randomized smoothed classifiers. We show the ensembling generality that SWEEN can help achieve optimal certified robustness. Furthermore, theoretical analysis proves that the optimal SWEEN model can be obtained from training under mild assumptions. We also develop an adaptive prediction algorithm to reduce the prediction and certification cost of SWEEN models. Extensive experiments show that SWEEN models outperform the upper envelope of their corresponding candidate models by a large margin. Moreover, SWEEN models constructed using a few small models can achieve comparable performance to a single large model with a notable reduction in training time.
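For readers unfamiliar with the setting, here is a minimal sketch of two pieces such a scheme builds on. It assumes the standard l2 certified radius from the randomized smoothing literature, sigma / 2 * (Phi^{-1}(p_top) - Phi^{-1}(p_runner_up)), and a plain convex combination of base-model class probabilities. Neither formula is quoted from this abstract. The function names and the example probabilities are hypothetical stand-ins for quantities that would in practice be estimated under Gaussian noise.

```python
from statistics import NormalDist

def certified_radius(sigma, p_top, p_runner_up):
    """l2 radius sigma/2 * (Phi^-1(p_top) - Phi^-1(p_runner_up)); 0.0 if not certifiable."""
    if not (0.0 < p_runner_up <= p_top < 1.0):
        return 0.0
    inv = NormalDist().inv_cdf  # inverse standard normal CDF
    return 0.5 * sigma * (inv(p_top) - inv(p_runner_up))

def weighted_ensemble(prob_vectors, weights):
    """Convex combination of class-probability vectors from several base models."""
    total = sum(weights)
    n_classes = len(prob_vectors[0])
    return [
        sum(w * p[c] for w, p in zip(weights, prob_vectors)) / total
        for c in range(n_classes)
    ]

# Hypothetical smoothed class probabilities from two base models.
model_probs = [[0.7, 0.2, 0.1], [0.9, 0.05, 0.05]]
ensemble = weighted_ensemble(model_probs, weights=[1.0, 3.0])
top, runner_up = sorted(ensemble, reverse=True)[:2]
print(ensemble, certified_radius(0.5, top, runner_up))
```

A larger gap between the top and runner-up probabilities gives a larger radius, which is why the choice of ensemble weights matters for certification.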

preprint · 2021 · arXiv

NPTC-net: Narrow-Band Parallel Transport Convolutional Neural Network on Point Clouds

Convolution plays a crucial role in various applications in signal and image processing, analysis, and recognition. It is also the main building block of convolution neural networks (CNNs). Designing appropriate convolution neural networks on manifold-structured point clouds can inherit and empower recent advances of CNNs to analyzing and processing point cloud data. However, one of the major challenges is to define a proper way to "sweep" filters through the point cloud as a natural generalization of the planar convolution and to reflect the point cloud's geometry at the same time. In this paper, we consider generalizing convolution by adapting parallel transport on the point cloud. Inspired by a triangulated surface-based method [Stefan C. Schonsheck, Bin Dong, and Rongjie Lai, arXiv:1805.07857.], we propose the Narrow-Band Parallel Transport Convolution (NPTC) using a specifically defined connection on a voxel-based narrow-band approximation of point cloud data. With that, we further propose a deep convolutional neural network based on NPTC (called NPTC-net) for point cloud classification and segmentation. Comprehensive experiments show that the proposed NPTC-net achieves similar or better results than current state-of-the-art methods on point cloud classification and segmentation.

preprint · 2020 · arXiv

Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations

In our work, we bridge deep neural network design with numerical differential equations. We show that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discretizations of differential equations. This finding brings us a brand new perspective on the design of effective deep architectures. We can take advantage of the rich knowledge in numerical analysis to guide us in designing new and potentially more effective deep networks. As an example, we propose a linear multi-step architecture (LM-architecture), which is inspired by the linear multi-step method for solving ordinary differential equations. The LM-architecture is an effective structure that can be used on any ResNet-like network. In particular, we demonstrate that LM-ResNet and LM-ResNeXt (i.e. the networks obtained by applying the LM-architecture to ResNet and ResNeXt respectively) can achieve noticeably higher accuracy than ResNet and ResNeXt on both CIFAR and ImageNet with comparable numbers of trainable parameters. Moreover, on both CIFAR and ImageNet, LM-ResNet/LM-ResNeXt can significantly compress (>50%) the original networks while maintaining similar performance. This can be explained mathematically using the concept of the modified equation from numerical analysis. Last but not least, we also establish a connection between stochastic control and noise injection in the training process, which helps to improve the generalization of the networks. Furthermore, by relating the stochastic training strategy to a stochastic dynamic system, we can easily apply stochastic training to networks with the LM-architecture. As an example, we introduce stochastic depth to LM-ResNet and achieve significant improvement over the original LM-ResNet on CIFAR10.
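A minimal sketch of the correspondence, assuming the two-step update x_{n+1} = (1 - k) x_n + k x_{n-1} + f(x_n) as the linear multi-step rule. The abstract does not spell out the formula, so the update, the scalar state, and the names `resnet_step`, `lm_step`, `run_lm` and `residual` are illustrative assumptions. With k = 0 the rule reduces to the residual update x_{n+1} = x_n + f(x_n), which has the form of a forward Euler step.

```python
def resnet_step(x, f):
    """One residual block, read as a forward Euler step: x + f(x)."""
    return x + f(x)

def lm_step(x, x_prev, f, k):
    """Assumed linear multi-step block: mixes the two most recent states."""
    return (1.0 - k) * x + k * x_prev + f(x)

def run_lm(x0, f, k, depth):
    """Stack `depth` blocks; the first has no history, so it is a plain residual step."""
    x_prev, x = x0, resnet_step(x0, f)
    for _ in range(depth - 1):
        x_prev, x = x, lm_step(x, x_prev, f, k)
    return x

def residual(x):
    # Hypothetical stand-in for a learned residual branch.
    return -0.1 * x

print(run_lm(1.0, residual, k=0.0, depth=5))  # identical to five residual steps
print(run_lm(1.0, residual, k=0.3, depth=5))
```

In a real network the state would be a feature tensor and k would typically be a trainable coefficient per block; the scalar version only shows how the extra history term enters the update.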

preprint · 2020 · arXiv

Blind Adversarial Training: Balance Accuracy and Robustness

Adversarial training (AT) aims to improve the robustness of deep learning models by mixing clean data and adversarial examples (AEs). Most existing AT approaches can be grouped into restricted and unrestricted approaches. Restricted AT requires a prescribed uniform budget to constrain the magnitude of the AE perturbations during training, with the obtained results showing high sensitivity to the budget. On the other hand, unrestricted AT uses unconstrained AEs, resulting in the use of AEs located beyond the decision boundary; these overestimated AEs significantly lower the accuracy on clean data. These limitations mean that the existing AT approaches have difficulty in obtaining a comprehensively robust model with high accuracy and robustness when confronting attacks with varying strengths. Considering this problem, this paper proposes a novel AT approach named blind adversarial training (BAT) to better balance the accuracy and robustness. The main idea of this approach is to use a cutoff-scale strategy to adaptively estimate a nonuniform budget to modify the AEs used in the training, ensuring that the strengths of the AEs are dynamically located in a reasonable range and ultimately improving the overall robustness of the AT model. The experimental results obtained using BAT for training classification models on several benchmarks demonstrate the competitive performance of this method.

preprint · 2020 · arXiv

MetaInv-Net: Meta Inversion Network for Sparse View CT Image Reconstruction

X-ray Computed Tomography (CT) is widely used in clinical applications such as diagnosis and image-guided interventions. In this paper, we propose a new deep learning based model for CT image reconstruction with the backbone network architecture built by unrolling an iterative algorithm. However, unlike the existing strategy of including as many data-adaptive components in the unrolled dynamics model as possible, we find that it is enough to learn only the parts where traditional designs mostly rely on intuition and experience. More specifically, we propose to learn an initializer for the conjugate gradient (CG) algorithm that is involved in one of the subproblems of the backbone model. Other components, such as image priors and hyperparameters, are kept as in the original design. Since a hypernetwork is introduced to infer the initialization of the CG module, the proposed model is a kind of meta-learning model. Therefore, we call the proposed model the meta-inversion network (MetaInv-Net). The proposed MetaInv-Net can be designed with far fewer trainable parameters while still achieving better image reconstruction performance than some state-of-the-art deep models in CT imaging. In simulated and real data experiments, MetaInv-Net performs very well and can be generalized beyond the training setting, i.e., to other scanning settings, noise levels, and data sets.

preprint · 2020 · arXiv

Quantization of electromagnetic modes and angular momentum on plasmonic nanowires

Quantum theory of surface plasmons is very important for studying the interactions between light and different metal nanostructures in nanoplasmonics. In this work, using the canonical quantization method, the surface plasmon polaritons (SPPs) on nanowires and their orbital and spin angular momentum are investigated. The results show that the SPPs on a nanowire carry both orbital and spin angular momentum during propagation. The result is then applied to a plasmonic nanowire waveguide to show agreement with the theory. The study is helpful for future work on nanowire-based plasmonic interactions and quantum-information-based optical circuits.

preprint · 2020 · arXiv

RODE-Net: Learning Ordinary Differential Equations with Randomness from Data

Random ordinary differential equations (RODEs), i.e. ODEs with random parameters, are often used to model complex dynamics. Most existing methods for identifying unknown governing RODEs from observed data rely on strong prior knowledge. Extracting the governing equations from data with less prior knowledge remains a great challenge. In this paper, we propose a deep neural network, called RODE-Net, to tackle this challenge by fitting a symbolic expression of the differential equation and the distribution of parameters simultaneously. To train the RODE-Net, we first estimate the parameters of the unknown RODE using the symbolic networks \cite{long2019pde} by solving a set of deterministic inverse problems based on the measured data, and use a generative adversarial network (GAN) to estimate the true distribution of the RODE's parameters. Then, we use the trained GAN as a regularization to further improve the estimation of the ODE's parameters. The two steps are performed alternately. Numerical results show that the proposed RODE-Net can estimate the distribution of model parameters well using simulated data and can make reliable predictions. It is worth noting that the GAN serves as a data-driven regularization in RODE-Net and is more effective than the $\ell_1$ based regularization that is often used in system identification.

preprint · 2020 · arXiv

Transferred Discrepancy: Quantifying the Difference Between Representations

Understanding what information neural networks capture is an essential problem in deep learning, and studying whether different models capture similar features is an initial step to achieve this goal. Previous works sought to define metrics over the feature matrices to measure the difference between two models. However, different metrics sometimes lead to contradictory conclusions, and there has been no consensus on which metric is suitable to use in practice. In this work, we propose a novel metric that goes beyond previous approaches. Recall that one of the most practical scenarios of using the learned representations is to apply them to downstream tasks. We argue that we should design the metric based on a similar principle. For that, we introduce the transferred discrepancy (TD), a new metric that defines the difference between two representations based on their downstream-task performance. Through an asymptotic analysis, we show how TD correlates with downstream tasks and the necessity to define metrics in such a task-dependent fashion. In particular, we also show that under specific conditions, the TD metric is closely related to previous metrics. Our experiments show that TD can provide fine-grained information for varied downstream tasks, and for the models trained from different initializations, the learned features are not the same in terms of downstream-task predictions. We find that TD may also be used to evaluate the effectiveness of different training strategies. For example, we demonstrate that the models trained with proper data augmentations that improve the generalization capture more similar features in terms of TD, while those with data augmentations that hurt the generalization will not. This suggests a training strategy that leads to more robust representation also trains models that generalize better.

preprint · 2019 · arXiv

Hybrid Integrated Photonics Using Bulk Acoustic Resonators

Microwave frequency acousto-optic modulation is realized by exciting high overtone bulk acoustic wave resonances (HBAR resonances) in the photonic stack. These confined mechanical stress waves exhibit vertically transmitting, high quality factor (Q) acoustic Fabry-Perot resonances that extend into the gigahertz domain and offer stress-optical interaction with the optical modes of the microresonator. Although HBARs are ubiquitously used in modern communication and often exploited in superconducting circuits, this is the first time they have been incorporated on a photonic circuit based chip. The electro-acousto-optical interaction observed within the optical modes exhibits high actuation linearity, low actuation power and negligible crosstalk. Using the electro-acousto-optic interaction, fast optical resonance tuning is achieved with sub-nanosecond transduction time. By removing the silicon backreflection, broadband acoustic modulation at 4.1 and 8.7 GHz is realized with a 3 dB bandwidth of 250 MHz each. The novel hybrid HBAR nanophotonic platform demonstrated here, which allows on-chip integration of micron-scale acoustic and photonic resonators, can find immediate applications in tunable microwave photonics, high bandwidth soliton microcomb stabilization, compact opto-electronic oscillators, and microwave to optical conversion schemes. Moreover, the hybrid platform allows implementation of momentum biasing, which enables on-chip non-reciprocal devices such as isolators or circulators and topological photonic bandstructures.

preprint · 2016 · arXiv

Building a comprehensive syntactic and semantic corpus of Chinese clinical texts

Objective: To build a comprehensive corpus covering syntactic and semantic annotations of Chinese clinical texts with corresponding annotation guidelines and methods as well as to develop tools trained on the annotated corpus, which supplies baselines for research on Chinese texts in the clinical domain. Materials and methods: An iterative annotation method was proposed to train annotators and to develop annotation guidelines. Then, by using annotation quality assurance measures, a comprehensive corpus was built, containing annotations of part-of-speech (POS) tags, syntactic tags, entities, assertions, and relations. Inter-annotator agreement (IAA) was calculated to evaluate the annotation quality and a Chinese clinical text processing and information extraction system (CCTPIES) was developed based on our annotated corpus. Results: The syntactic corpus consists of 138 Chinese clinical documents with 47,424 tokens and 2553 full parsing trees, while the semantic corpus includes 992 documents that annotated 39,511 entities with their assertions and 7695 relations. IAA evaluation shows that this comprehensive corpus is of good quality, and the system modules are effective. Discussion: The annotated corpus makes a considerable contribution to natural language processing (NLP) research into Chinese texts in the clinical domain. However, this corpus has a number of limitations. Some additional types of clinical text should be introduced to improve corpus coverage and active learning methods should be utilized to promote annotation efficiency. Conclusions: In this study, several annotation guidelines and an annotation method for Chinese clinical texts were proposed, and a comprehensive corpus with its NLP modules were constructed, providing a foundation for further study of applying NLP techniques to Chinese texts in the clinical domain.

preprint · 2016 · arXiv

CT Image Reconstruction by Spatial-Radon Domain Data-Driven Tight Frame Regularization

This paper proposes a spatial-Radon domain CT image reconstruction model based on data-driven tight frames (SRD-DDTF). The proposed SRD-DDTF model combines the idea of the joint image and Radon domain inpainting model of \cite{Dong2013X} with that of the data-driven tight frames for image denoising \cite{cai2014data}. It differs from existing models in that both the CT image and its corresponding high quality projection image are reconstructed simultaneously using sparsity priors given by tight frames that are adaptively learned from the data to provide optimal sparse approximations. An alternating minimization algorithm is designed to solve the proposed model, which is nonsmooth and nonconvex. Convergence analysis of the algorithm is provided. Numerical experiments show that the SRD-DDTF model is superior to the model of \cite{Dong2013X}, especially in recovering some subtle structures in the images.

preprint · 2016 · arXiv

Image Restoration: A General Wavelet Frame Based Model and Its Asymptotic Analysis

Image restoration is one of the most important areas in imaging science. Mathematical tools have been widely used in image restoration, where the wavelet frame based approach is one of the successful examples. In this paper, we introduce a generic wavelet frame based image restoration model, called the "general model", which includes most of the existing wavelet frame based models as special cases. Moreover, the general model also includes examples that are new to the literature. Motivated by our earlier studies [1-3], we provide an asymptotic analysis of the general model as image resolution goes to infinity, which establishes a connection between the general model in the discrete setting and a new variational model in the continuum setting. The variational model also includes some of the existing variational models as special cases, such as the total generalized variational model proposed by [4]. In the end, we introduce an algorithm for solving the general model and present one numerical simulation as an example.

preprint · 2016 · arXiv

Jiamusi Pulsar Observations: I. Abnormal emission events of PSR B0919+06

PSR B0919+06 generally radiates radio pulses in a normal phase range. It has been known for its occasional perplexing abnormal emission events, wherein individual pulses come to an earlier phase range for a few tens of periods and then return to the usual phase. Heretofore, only a few such events have been available for study. We observed PSR B0919+06 for about 30 hours using the Jiamusi 66-m telescope at Jiamusi Deep Space Station at S-band, and detected 92 abnormal emission events. We identify four types of events based on the abrupt or gradual phase-shifting of individual pulses. The abnormal emission events are seen to occur randomly every 1000 to 3000 periods, and they affect the leading edge of the mean profile by up to 2% in amplitude. The abnormal emission events are probably related to gradual changes of emission processing in the pulsar magnetosphere.

preprint · 2016 · arXiv

Limited Tomography Reconstruction via Tight Frame and Sinogram Extrapolation

X-ray computed tomography (CT) is one of the widely used diagnostic tools for medical and dental tomographic imaging of the human body. However, the standard filtered backprojection reconstruction method requires complete knowledge of the projection data. In the case of limited data, the inverse problem of CT becomes more ill-posed, and the reconstructed image is degraded by artifacts. In this paper, we consider two-dimensional CT reconstruction using horizontally truncated projections. Over the decades, numerous results, including the sparsity model based approach, have enabled reconstruction of the image inside the region of interest (ROI) from limited knowledge of the data. Unlike these existing methods, we try to reconstruct the entire CT image from limited knowledge of the sinogram via tight frame regularization and simultaneous sinogram extrapolation. Our proposed model shows more promising numerical simulation results than the existing sparsity model based approach.

preprint2016arXiv

Randomized Algorithms For High Quality Treatment Planning in Volumetric Modulated Arc Therapy

In recent years, volumetric modulated arc therapy (VMAT) has become an increasingly important radiation technique, widely used in clinical applications for cancer treatment. One of the key problems in VMAT is treatment plan optimization, which is complicated by the constraints imposed by the equipment involved. In this paper, we consider a model with four major constraints: the bound on the beam intensity, an upper bound on the rate of change of the beam intensity, the moving speed of the leaves of the multi-leaf collimator (MLC), and its directional-convexity. We solve the model by a two-stage algorithm that alternately minimizes with respect to the shapes of the aperture and the beam intensities. Specifically, the shapes of the aperture are obtained by a greedy algorithm whose performance is enhanced by random sampling in the leaf pairs with a decremental rate. The beam intensity is optimized using a gradient projection method with non-monotonic line search. We further improve the proposed algorithm by an incremental random importance sampling of the voxels to reduce the computational cost of the energy functional. Numerical simulations on two clinical cancer data sets demonstrate that our method is highly competitive with the state-of-the-art algorithms in terms of both computational time and quality of treatment planning.

preprint2015arXiv

Sparse Representation on Graphs by Tight Wavelet Frames and Applications

In this paper, we introduce a new (constructive) characterization of tight wavelet frames on non-flat domains in both the continuum setting, i.e. on manifolds, and the discrete setting, i.e. on graphs, and we discuss how fast tight wavelet frame transforms can be computed and how they can be effectively used to process graph data. We start by defining the quasi-affine system on a given manifold $\mathcal{M}$ that is formed by generalized dilations and shifts of a finite collection of wavelet functions $\Psi:=\{\psi_j: 1\le j\le r\}\subset L_2(\mathbb{R})$. We further require that $\psi_j$ is generated by some refinable function $\phi$ with mask $a_j$. We present the condition needed for the masks $\{a_j: 0\le j\le r\}$ so that the associated quasi-affine system generated by $\Psi$ is a tight frame for $L_2(\mathcal{M})$. Then, we discuss how the transition from the continuum (manifolds) to the discrete setting (graphs) can be naturally done. In order for the proposed discrete tight wavelet frame transforms to be useful in applications, we show how the transforms can be computed efficiently and accurately by proposing the fast tight wavelet frame transforms for graph data (WFTG). Finally, we consider two specific applications of the proposed WFTG: graph data denoising and semi-supervised clustering. Utilizing the sparse representation provided by the WFTG, we propose $\ell_1$-norm based optimization models on graphs for denoising and semi-supervised clustering. On one hand, our numerical results show a significant advantage of the WFTG over the spectral graph wavelet transform (SGWT) by [1] for both applications. On the other hand, numerical experiments on two real data sets show that the proposed semi-supervised clustering model using the WFTG is overall competitive with the state-of-the-art methods developed in the literature of high-dimensional data classification, and is superior to some of these methods.

preprint2014arXiv

Sparsifying the Fisher Linear Discriminant by Rotation

Many high dimensional classification techniques have been proposed in the literature based on sparse linear discriminant analysis (LDA). To use them efficiently, sparsity of linear classifiers is a prerequisite. However, this might not be readily available in many applications, and rotations of data are required to create the needed sparsity. In this paper, we propose a family of rotations to create the required sparsity. The basic idea is to use the principal components of the sample covariance matrix of the pooled samples and its variants to rotate the data first and then to apply an existing high dimensional classifier. This rotate-and-solve procedure can be combined with any existing classifier, and is robust against the sparsity level of the true model. We show that these rotations do create the sparsity needed for high dimensional classification and provide a theoretical understanding of why such a rotation works empirically. The effectiveness of the proposed method is demonstrated by a number of simulated and real data examples, and the improvements of our method over some popular high dimensional classification rules are clearly shown.
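The rotation step described in this abstract can be illustrated with a minimal sketch. It assumes one particular reading of "the sample covariance matrix of the pooled samples", namely the covariance of all samples stacked together regardless of class; the paper's variants are not reproduced. The function names `pca_rotation` and `rotate` are hypothetical, and the downstream sparse classifier is deliberately left out.

```python
import numpy as np

def pca_rotation(X):
    """Return an orthogonal matrix whose columns are the principal
    directions of the covariance of all rows of X (classes pooled).

    Assumption: "pooled" is read as "all samples together"; this is
    an illustrative choice, not necessarily the authors' definition.
    """
    Xc = X - X.mean(axis=0)                  # center the pooled samples
    cov = Xc.T @ Xc / (X.shape[0] - 1)       # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]        # largest variance first
    return eigvecs[:, order]

def rotate(X, Q):
    """Express each sample in the principal-component coordinates."""
    return X @ Q
```

After the rotation, any existing high dimensional classifier would be trained on `rotate(X, Q)` instead of `X`; new samples must be rotated with the same `Q` before prediction.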

preprint2013arXiv

A Contour-Guided Deformable Image Registration Algorithm for Adaptive Radiotherapy

In adaptive radiotherapy, deformable image registration is often conducted between the planning CT and treatment CT (or cone beam CT) to generate a deformation vector field (DVF) for dose accumulation and contour propagation. The auto-propagated contours on the treatment CT may contain relatively large errors, especially in low contrast regions. Clinician inspection and editing of the propagated contours are frequently needed. The edited contours are able to meet the clinical requirement for adaptive therapy; however, the DVF is still inaccurate and inconsistent with the edited contours. The purpose of this work is to develop a contour-guided deformable image registration (CG-DIR) algorithm to improve the accuracy and consistency of the DVF for adaptive radiotherapy. Incorporation of the edited contours into the registration algorithm is realized by regularizing the objective function of the original demons algorithm with a term of intensity matching between the delineated structure set pairs. The CG-DIR algorithm is implemented on computer graphics processing units (GPUs) by following the original GPU-based demons algorithm computation framework [Gu et al, Phys Med Biol. 55(1): 207-219, 2010]. The performance of CG-DIR is evaluated on data from five clinical head-and-neck cancer patients and one pelvic cancer patient. It is found that, compared with the original demons, CG-DIR improves the accuracy and consistency of the DVF while retaining similarly high computational efficiency.

preprint2012arXiv

Optimal Surface Marker Locations for Tumor Motion Estimation in Lung Cancer Radiotherapy

Using fiducial markers on the patient's body surface to predict the tumor location is a widely used approach in lung cancer radiotherapy. The purpose of this work is to propose an algorithm that automatically identifies a sparse set of locations on the patient's surface with the optimal prediction power for the tumor motion. The sparse selection of markers on the external surface and the assumed linear relationship between the marker motion and the internal tumor motion are represented by a prediction matrix. Such a matrix is determined by solving an optimization problem, where the objective function contains a sparsity term that penalizes the number of markers chosen on the patient's surface. The performance of our algorithm has been tested on realistic clinical data of four lung cancer patients. Thoracic 4DCT scans with 10 phases are used for the study. On a reference phase, a grid of points is cast on the patient's surface (except for the patient's back) and propagated to other phases via deformable image registration of the corresponding CT images. Tumor locations at each phase are also manually delineated. We use 9 out of 10 phases of the 4DCT images to identify a small group of surface markers that are most correlated with the motion of the tumor, and find the prediction matrix at the same time. The 10th phase is then used to test the accuracy of the prediction. It is found that on average 6 to 7 surface markers are necessary to predict tumor locations with a 3D error of about 1 mm. In addition, the selected marker locations lie close to those areas where surface point motion has a high correlation with the tumor motion. Our method can automatically select sparse locations on the patient's external surface and estimate a correlation matrix based on 4DCT, so that the selected surface locations can be used to place fiducial markers to optimally predict internal tumor motions.

preprint2011arXiv

$\ell_0$ Minimization for Wavelet Frame Based Image Restoration

The theory of (tight) wavelet frames has been extensively studied in the past twenty years, and wavelet frames are currently widely used for image restoration and other image processing and analysis problems. The success of wavelet frame based models, including the balanced approach and the analysis based approach, is due to their capability of sparsely approximating piecewise smooth functions like images. Motivated by the balanced approach and the analysis based approach, we propose a wavelet frame based $\ell_0$ minimization model, where the $\ell_0$ "norm" of the frame coefficients is penalized. We adapt the penalty decomposition (PD) method to solve the proposed optimization problem. Numerical results show that the proposed model solved by the PD method can generate images with better quality than those obtained by either the analysis based approach or the balanced approach in terms of restoring sharp features as well as maintaining smoothness of the recovered images. Some convergence analysis of the PD method is also provided.
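One reason an $\ell_0$ penalty on frame coefficients is tractable inside a splitting scheme such as penalty decomposition is that the coefficient subproblem has a closed form. The sketch below assumes a generic subproblem of the form $\min_a \lambda\|a\|_0 + \tfrac{\rho}{2}\|a - w\|_2^2$, where `w` stands for the current frame coefficients; the names `lam`, `rho`, and `hard_threshold` are illustrative and are not taken from the paper, whose exact PD subproblems are not given in the abstract.

```python
import numpy as np

def hard_threshold(w, lam, rho):
    """Entrywise minimizer of lam*||a||_0 + (rho/2)*||a - w||^2.

    Keeping an entry costs lam; zeroing it costs (rho/2)*w_i^2.
    So an entry is kept exactly when |w_i| > sqrt(2*lam/rho).
    """
    w = np.asarray(w, dtype=float)
    keep = np.abs(w) > np.sqrt(2.0 * lam / rho)
    return np.where(keep, w, 0.0)
```

Unlike soft thresholding for $\ell_1$, the surviving coefficients are not shrunk, which is consistent with the abstract's emphasis on restoring sharp features.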

preprint2011arXiv

A nonlinear PDE-based method for sparse deconvolution

In this paper, we introduce a new nonlinear evolution partial differential equation for sparse deconvolution problems. The proposed PDE has the form of a continuity equation that arises in various research areas, e.g. fluid dynamics and optimal transportation, and thus has some interesting physical and geometric interpretations. The underlying optimization model that we consider is the standard $\ell_1$ minimization with linear equality constraints, i.e. $\min_u\{\|u\|_1 : Au=f\}$ with $A$ being an under-sampled convolution operator. We show that our PDE preserves the $\ell_1$ norm while lowering the residual $\|Au-f\|_2$. More importantly, the solution of the PDE becomes sparser asymptotically, which is illustrated numerically. Therefore, it can be treated as a natural and helpful plug-in to some algorithms for $\ell_1$ minimization problems, e.g. Bregman iterative methods introduced for sparse reconstruction problems in [W. Yin, S. Osher, D. Goldfarb, and J. Darbon, SIAM J. Imaging Sci., 1 (2008), pp. 143-168]. Numerical experiments show great improvements in terms of both convergence speed and reconstruction quality.

preprint2011arXiv

Fast Linearized Bregman Iteration for Compressive Sensing and Sparse Denoising

We propose and analyze an extremely fast, efficient, and simple method for solving the problem $\min\{\|u\|_1 : Au = f,\ u \in \mathbb{R}^n\}$. This method was first described in [J. Darbon and S. Osher, preprint, 2007], with more details in [W. Yin, S. Osher, D. Goldfarb and J. Darbon, SIAM J. Imaging Sciences, 1(1), 143-168, 2008] and rigorous theory given in [J. Cai, S. Osher and Z. Shen, Math. Comp., to appear, 2008, see also UCLA CAM Report 08-06] and [J. Cai, S. Osher and Z. Shen, UCLA CAM Report, 08-52, 2008]. The motivation was compressive sensing, which now has a vast and exciting history, which seems to have started with Candes et al. [E. Candes, J. Romberg and T. Tao, 52(2), 489-509, 2006] and Donoho [D. L. Donoho, IEEE Trans. Inform. Theory, 52, 1289-1306, 2006]. See [W. Yin, S. Osher, D. Goldfarb and J. Darbon, SIAM J. Imaging Sciences 1(1), 143-168, 2008] and [J. Cai, S. Osher and Z. Shen, Math. Comp., to appear, 2008, see also UCLA CAM Report, 08-06] and [J. Cai, S. Osher and Z. Shen, UCLA CAM Report, 08-52, 2008] for a large set of references. Our method introduces an improvement called "kicking" of the very efficient method of [J. Darbon and S. Osher, preprint, 2007] and [W. Yin, S. Osher, D. Goldfarb and J. Darbon, SIAM J. Imaging Sciences, 1(1), 143-168, 2008] and also applies it to the problem of denoising of undersampled signals. The use of Bregman iteration for denoising of images began in [S. Osher, M. Burger, D. Goldfarb, J. Xu and W. Yin, Multiscale Model. Simul, 4(2), 460-489, 2005] and led to improved results for total variation based methods. Here we apply it to denoise signals, especially essentially sparse signals, which might even be undersampled.
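For orientation, the basic linearized Bregman iteration that this abstract builds on is commonly written as $v^{k+1} = v^k + A^\top(f - Au^k)$, $u^{k+1} = \delta\,\mathrm{shrink}(v^{k+1}, \mu)$. The sketch below implements only that plain iteration; the "kicking" refinement proposed in the paper is not reproduced. The step-size choice $\delta = 1/\|A\|_2^2$, the parameter values, and the helper `demo_problem` are assumptions made for illustration.

```python
import numpy as np

def shrink(v, mu):
    """Soft-thresholding: move every entry toward zero by mu."""
    return np.sign(v) * np.maximum(np.abs(v) - mu, 0.0)

def linearized_bregman(A, f, mu, n_iter=3000):
    """Plain linearized Bregman iteration (no "kicking").

    Returns the final iterate u and the residual norms ||f - A u^k||
    recorded at the start of each iteration.
    """
    # Assumed step size: delta = 1 / ||A||_2^2 (spectral norm squared).
    delta = 1.0 / np.linalg.norm(A, 2) ** 2
    u = np.zeros(A.shape[1])
    v = np.zeros(A.shape[1])
    residuals = []
    for _ in range(n_iter):
        r = f - A @ u
        residuals.append(np.linalg.norm(r))
        v = v + A.T @ r              # accumulate back-projected residual
        u = delta * shrink(v, mu)    # threshold, then scale
    return u, np.array(residuals)

def demo_problem(seed=0):
    """Hypothetical test problem: 20 measurements of a 3-sparse signal."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((20, 50)) / np.sqrt(20)
    u_true = np.zeros(50)
    u_true[[3, 17, 41]] = [5.0, -4.0, 3.0]
    return A, A @ u_true, u_true
```

A typical call is `u, res = linearized_bregman(*demo_problem()[:2], mu=1.0)`. With this step size the residual norm does not increase from one iteration to the next, but it can plateau for long stretches while `u` stays unchanged, which is the stagnation behavior that a refinement such as kicking is meant to shorten.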

preprint2011arXiv

GPU-based Iterative Cone Beam CT Reconstruction Using Tight Frame Regularization

X-ray imaging dose from serial cone-beam CT (CBCT) scans raises a clinical concern in most image guided radiation therapy procedures. It is the goal of this paper to develop a fast GPU-based algorithm to reconstruct high quality CBCT images from undersampled and noisy projection data so as to lower the imaging dose. For this purpose, we have developed an iterative tight frame (TF) based CBCT reconstruction algorithm. A condition that a real CBCT image has a sparse representation under a TF basis is imposed in the iteration process as regularization to the solution. To speed up the computation, a multi-grid method is employed. Our GPU implementation has achieved high computational efficiency, and a CBCT image of resolution $512\times512\times70$ can be reconstructed in about 5 min. We have tested our algorithm on a digital NCAT phantom and a physical Catphan phantom. It is found that our TF-based algorithm is able to reconstruct CBCT images in the context of undersampling and low mAs levels. We have also quantitatively analyzed the reconstructed CBCT image quality in terms of modulation-transfer-function and contrast-to-noise ratio under various scanning conditions. The results confirm the high CBCT image quality obtained from our TF algorithm. Moreover, our algorithm has also been validated in a real clinical context using a head-and-neck patient case. Comparisons of the developed TF algorithm and the current state-of-the-art TV algorithm have also been made in the various cases studied, in terms of reconstructed image quality and computation efficiency.

preprint2009arXiv

A New Multiscale Representation for Shapes and Its Application to Blood Vessel Recovery

In this paper, we will first introduce a novel multiscale representation (MSR) for shapes. Based on the MSR, we will then design a surface inpainting algorithm to recover 3D geometry of blood vessels. Because of the nature of irregular morphology in vessels and organs, both phantom and real inpainting scenarios were tested using our new algorithm. Successful vessel recoveries are demonstrated with numerical estimation of the degree of arteriosclerosis and vessel occlusion.
