Source author record

Heng Yu

Heng Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

astro-ph.CO, Computation and Language, astro-ph.GA, astro-ph.HE, astro-ph.SR, Computer Vision, eess.IV, eess.SP, Machine Learning, physics.med-ph

Catalog footprint

What is connected

15 works

10 topics

4 close collaborators

preprint, 2022, arXiv

GPU-Net: Lightweight U-Net with more diverse features

Image segmentation is an important task in medical imaging, and many methods based on convolutional neural networks (CNNs) have been proposed, among which U-Net and its variants show promising performance. In this paper, we propose the GP-module and GPU-Net, based on U-Net, which can learn more diverse features by introducing the Ghost module and atrous spatial pyramid pooling (ASPP). Our method achieves better performance with more than 4 times fewer parameters and 2 times fewer FLOPs, which provides a new potential direction for future research. Our plug-and-play module can also be applied to existing segmentation methods to further improve their performance.

preprint, 2022, arXiv

Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation

The principal task in supervised neural machine translation (NMT) is to learn to generate target sentences conditioned on the source inputs from a set of parallel sentence pairs, and thus produce a model capable of generalizing to unseen instances. However, it is commonly observed that the generalization performance of the model is highly influenced by the amount of parallel data used in training. Although data augmentation is widely used to enrich the training data, conventional methods with discrete manipulations fail to generate diverse and faithful training samples. In this paper, we present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT), which augments each training instance with an adjacency semantic region that could cover adequate variants of literal expression under the same meaning. We conduct extensive experiments on both rich-resource and low-resource settings involving various language pairs, including WMT14 English-{German,French}, NIST Chinese-English and multiple low-resource IWSLT translation tasks. The provided empirical evidence shows that CsaNMT sets a new level of performance among existing augmentation techniques, improving on the state-of-the-art by a large margin. The core code is contained in Appendix E.

preprint, 2022, arXiv

Scan Specific Artifact Reduction in K-space (SPARK) Neural Networks Synergize with Physics-based Reconstruction to Accelerate MRI

Purpose: To develop a scan-specific model that estimates and corrects k-space errors made when reconstructing accelerated Magnetic Resonance Imaging (MRI) data. Methods: Scan-Specific Artifact Reduction in k-space (SPARK) trains a convolutional neural network to estimate and correct k-space errors made by an input reconstruction technique by back-propagating from the mean-squared-error loss between an auto-calibration signal (ACS) and the input technique's reconstructed ACS. First, SPARK is applied to GRAPPA and demonstrates improved robustness over other scan-specific models, such as RAKI and residual-RAKI. Subsequent experiments demonstrate that SPARK synergizes with residual-RAKI to improve reconstruction performance. SPARK also improves reconstruction quality when applied to advanced acquisition and reconstruction techniques like 2D virtual coil (VC-) GRAPPA, 2D LORAKS, 3D GRAPPA without an integrated ACS region, and 2D/3D wave-encoded images. Results: SPARK yields 1.5x - 2x RMSE reduction when applied to GRAPPA and improves robustness to ACS size for various acceleration rates in comparison to other scan-specific techniques. When applied to advanced reconstruction techniques such as residual-RAKI, 2D VC-GRAPPA and LORAKS, SPARK achieves up to 20% RMSE improvement. SPARK with 3D GRAPPA also improves performance by ~2x and perceived image quality without a fully sampled ACS region. Finally, SPARK synergizes with non-Cartesian 2D and 3D wave-encoding imaging by reducing RMSE by 20-25% and providing qualitative improvements. Conclusion: SPARK synergizes with physics-based acquisition and reconstruction techniques to improve accelerated MRI by training scan-specific models to estimate and correct reconstruction errors in k-space.

preprint, 2020, arXiv

AR: Auto-Repair the Synthetic Data for Neural Machine Translation

Many studies have shown that, compared with training on limited authentic parallel data alone, incorporating synthetic parallel data generated by back translation (BT) or forward translation (FT, or self-training) into the NMT training process can significantly improve translation quality. However, a well-known shortcoming is that synthetic parallel data is noisy because it is generated by an imperfect NMT system. As a result, the improvements in translation quality brought by the synthetic parallel data are greatly diminished. In this paper, we propose a novel Auto-Repair (AR) framework to improve the quality of synthetic data. Our proposed AR model can learn the transformation from a low-quality (noisy) input sentence to a high-quality sentence based on large-scale monolingual data with BT and FT techniques. The noise in synthetic parallel data will be sufficiently eliminated by the proposed AR model, and the repaired synthetic parallel data can then help NMT models achieve larger improvements. Experimental results show that our approach can effectively improve the quality of synthetic parallel data, and that the NMT model trained with the repaired synthetic data achieves consistent improvements on both the WMT14 EN→DE and IWSLT14 DE→EN translation tasks.

preprint, 2020, arXiv

Chandra and XMM-Newton observations of A2256: cold fronts, merger shocks, and constraint on the IC emission

We present the results of deep Chandra and XMM-Newton observations of a complex merging galaxy cluster Abell 2256 (A2256) that hosts a spectacular radio relic (RR). The temperature and metallicity maps show clear evidence of a merger between the western subcluster (SC) and the primary cluster (PC). We detect five X-ray surface brightness edges. Three of them near the cluster center are cold fronts (CFs): CF1 is associated with the infalling SC; CF2 is located in the east of the PC; and CF3 is to the west of the PC core. The other two edges at cluster outskirts are shock fronts (SFs): SF1 near the RR in the NW has Mach numbers derived from the temperature and the density jumps, respectively, of $M_T=1.62\pm0.12$ and $M_ρ=1.23\pm0.06$; SF2 in the SE has $M_T=1.54\pm0.05$ and $M_ρ=1.16\pm0.13$. In the region of the RR, there is no evidence for the correlation between X-ray and radio substructures, from which we estimate an upper limit for the inverse-Compton emission, and therefore set a lower limit on the magnetic field ($\sim$ 450 kpc from PC center) of $B>1.0\ μ$G for a single power-law electron spectrum or $B>0.4\ μ$G for a broken power-law electron spectrum. We propose a merger scenario including a PC, an SC, and a group. Our merger scenario accounts for the X-ray edges, diffuse radio features, and galaxy kinematics, as well as projection effects.
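For reference, Mach numbers such as $M_T$ and $M_ρ$ are commonly obtained by inverting the Rankine-Hugoniot jump conditions. The sketch below assumes an ideal monatomic gas (adiabatic index 5/3); the abstract does not state which relations or index the authors used, and the function names are illustrative only.

```python
import math

GAMMA = 5.0 / 3.0  # assumed adiabatic index (ideal monatomic gas)


def density_jump(mach, gamma=GAMMA):
    """Post-shock to pre-shock density ratio for a given Mach number."""
    m2 = mach ** 2
    return (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)


def temperature_jump(mach, gamma=GAMMA):
    """Post-shock to pre-shock temperature ratio for a given Mach number."""
    m2 = mach ** 2
    numerator = (2.0 * gamma * m2 - (gamma - 1.0)) * ((gamma - 1.0) * m2 + 2.0)
    return numerator / ((gamma + 1.0) ** 2 * m2)


def mach_from_density(ratio, gamma=GAMMA):
    """Invert the density jump; valid for 1 <= ratio < (gamma+1)/(gamma-1)."""
    limit = (gamma + 1.0) / (gamma - 1.0)
    if not 1.0 <= ratio < limit:
        raise ValueError("density ratio outside the physical shock range")
    return math.sqrt(2.0 * ratio / ((gamma + 1.0) - ratio * (gamma - 1.0)))


def mach_from_temperature(ratio, gamma=GAMMA):
    """Invert the temperature jump by solving a quadratic in M^2."""
    if ratio < 1.0:
        raise ValueError("temperature ratio must be >= 1 for a shock")
    a = 2.0 * gamma * (gamma - 1.0)
    b = 4.0 * gamma - (gamma - 1.0) ** 2 - ratio * (gamma + 1.0) ** 2
    c = -2.0 * (gamma - 1.0)
    # Only the positive root of the quadratic gives a physical M^2.
    m2 = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return math.sqrt(m2)
```

Under these assumptions, `mach_from_temperature(temperature_jump(1.62))` returns 1.62 up to rounding, which is a quick consistency check on the inversion.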

preprint, 2020, arXiv

GRET: Global Representation Enhanced Transformer

The Transformer, based on the encoder-decoder framework, has achieved state-of-the-art performance on several natural language generation tasks. The encoder maps the words in the input sentence into a sequence of hidden states, which are then fed into the decoder to generate the output sentence. These hidden states usually correspond to the input words and focus on capturing local information. However, the global (sentence level) information is seldom explored, leaving room for the improvement of generation quality. In this paper, we propose a novel global representation enhanced Transformer (GRET) to explicitly model global representation in the Transformer network. Specifically, in the proposed model, an external state is generated for the global representation from the encoder. The global representation is then fused into the decoder during the decoding process to improve generation quality. We conduct experiments in two text generation tasks: machine translation and text summarization. Experimental results on four WMT machine translation tasks and the LCSTS text summarization task demonstrate the effectiveness of the proposed approach on natural language generation.

preprint, 2020, arXiv

Multiscale Collaborative Deep Models for Neural Machine Translation

Recent evidence reveals that Neural Machine Translation (NMT) models with deeper neural networks can be more effective but are difficult to train. In this paper, we present a MultiScale Collaborative (MSC) framework to ease the training of NMT models that are substantially deeper than those used previously. We explicitly boost the gradient back-propagation from top to bottom levels by introducing a block-scale collaboration mechanism into deep NMT models. Then, instead of forcing the whole encoder stack to directly learn a desired representation, we let each encoder block learn a fine-grained representation and enhance it by encoding spatial dependencies using a context-scale collaboration. We provide empirical evidence showing that the MSC nets are easy to optimize and can obtain improvements of translation quality from considerably increased depth. On IWSLT translation tasks with three translation directions, our extremely deep models (with 72-layer encoders) surpass strong baselines by +2.2 to +3.1 BLEU points. In addition, our deep MSC achieves a BLEU score of 30.56 on the WMT14 English-German task, which significantly outperforms state-of-the-art deep NMT models.

preprint, 2020, arXiv

Unveiling the Hierarchical Structure of Open Star Clusters: the Perseus Double Cluster

We introduce a new kinematic method to investigate the structure of open star clusters. We adopt a hierarchical clustering algorithm that uses the celestial coordinates and the proper motions of the stars in the field of view of the cluster to estimate a proxy of the pairwise binding energy of the stars and arrange them in a binary tree. The cluster substructures and their members are identified by trimming the tree at two thresholds, according to the $σ$-plateau method. Testing the algorithm on 100 mock catalogs shows that, on average, the membership of the identified clusters is $(91.5\pm 3.5)$\% complete and the fraction of unrelated stars is $(10.4\pm 2.0)$\%. We apply the algorithm to the stars in the field of view of the Perseus double cluster from the Data Release 2 of Gaia. This approach identifies a single structure, Sub1, that separates into two substructures, Sub1-1 and Sub1-2. These substructures coincide with $h$ Per and $χ$ Per: the distributions of the proper motions and the color-magnitude diagrams of the members of Sub1-1 and Sub1-2 are fully consistent with those of $h$ Per and $χ$ Per reported in the literature. These results suggest that our hierarchical clustering algorithm can be a powerful tool to unveil the complex kinematic information of star clusters.
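The completeness and contamination figures quoted for the mock catalogs can be illustrated with a small set-based sketch. It assumes completeness means the fraction of true members that are recovered and contamination means the fraction of identified stars that are not true members; the abstract does not spell out its exact definitions, and the function name is hypothetical.

```python
def completeness_and_contamination(identified, true_members):
    """Compare an identified membership list with the known true members.

    Returns (completeness, contamination), both as fractions in [0, 1].
    The definitions used here are assumptions, not taken from the paper.
    """
    identified = set(identified)
    true_members = set(true_members)
    if not identified or not true_members:
        raise ValueError("both membership lists must be non-empty")
    recovered = identified & true_members   # true members that were found
    interlopers = identified - true_members  # identified but unrelated stars
    completeness = len(recovered) / len(true_members)
    contamination = len(interlopers) / len(identified)
    return completeness, contamination
```

For example, identifying stars `{1, 2, 3, 4, 9}` when the true members are `{1, 2, 3, 4, 5}` recovers four of five members and includes one interloper among five identified stars.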

preprint, 2016, arXiv

Agreement-based Learning of Parallel Lexicons and Phrases from Non-Parallel Corpora

We introduce an agreement-based approach to learning parallel lexicons and phrases from non-parallel corpora. The basic idea is to encourage two asymmetric latent-variable translation models (i.e., source-to-target and target-to-source) to agree on identifying latent phrase and word alignments. The agreement is defined at both word and phrase levels. We develop a Viterbi EM algorithm for jointly training the two unidirectional models efficiently. Experiments on the Chinese-English dataset show that agreement-based learning significantly improves both alignment and translation performance.

preprint, 2016, arXiv

Searching for bulk motions in the ICM of massive, merging clusters with Chandra CCD data

We search for bulk motions in the intracluster medium (ICM) of massive clusters showing evidence of an ongoing or recent major merger with spatially resolved spectroscopy in Chandra CCD data. We identify a sample of 6 merging clusters with $>$150 ks Chandra exposure in the redshift range $0.1 < z < 0.3$. By performing X-ray spectral analysis of projected ICM regions selected according to their surface brightness, we obtain the projected redshift maps for all of these clusters. After performing a robust analysis of the statistical and systematic uncertainties in the measured X-ray redshift $z_{\rm X}$, we check whether or not the global $z_{\rm X}$ distribution differs from that expected when the ICM is at rest. We find evidence of significant bulk motions at more than 3$σ$ in A2142 and A115, and less than 2$σ$ in A2034 and A520. Focusing on single regions, we identify significant localized velocity differences in all of the merging clusters. We also perform the same analysis on two relaxed clusters with no signatures of recent mergers, finding no signs of bulk motions, as expected. Our results indicate that deep Chandra CCD data enable us to identify the presence of bulk motions at the level of $v_{\rm BM} > 1000\ {\rm km\ s^{-1}}$ in the ICM of massive merging clusters at $0.1<z<0.3$. Although the CCD spectral resolution is not sufficient for a detailed analysis of the ICM dynamics, Chandra CCD data constitute a key diagnostic tool complementing X-ray bolometers on board future X-ray missions.
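To give a sense of scale for the 1000 km/s threshold, the sketch below converts a redshift difference between two ICM regions into a line-of-sight velocity difference using the standard low-velocity relation Δv = c Δz / (1 + z). It is an illustration only; the function names are hypothetical and the abstract does not describe the authors' own conversion.

```python
C_KM_S = 299792.458  # speed of light in km/s


def los_velocity_difference(z_region, z_reference):
    """Line-of-sight velocity (km/s) of a region relative to a reference redshift.

    Uses dv = c * (z_region - z_reference) / (1 + z_reference), which is
    adequate when the velocity is much smaller than the speed of light.
    """
    return C_KM_S * (z_region - z_reference) / (1.0 + z_reference)


def redshift_offset_for_velocity(velocity_km_s, z_reference):
    """Redshift offset that corresponds to a given line-of-sight velocity."""
    return velocity_km_s * (1.0 + z_reference) / C_KM_S
```

At a reference redshift of 0.2, a bulk motion of 1000 km/s corresponds to a redshift offset of about 0.004, which indicates the precision the X-ray redshift measurement has to reach.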

preprint, 2016, arXiv

The unrelaxed dynamical structure of the galaxy cluster Abell 85

For the first time, we explore the dynamics of the central region of a galaxy cluster within $r_{500}\sim 600h^{-1}$ kpc from its center by combining optical and X-ray spectroscopy. We use (1) the caustic technique that identifies the cluster substructures and their galaxy members with optical spectroscopic data, and (2) the X-ray redshift fitting procedure that estimates the redshift distribution of the intracluster medium (ICM). We use the spatial and redshift distributions of the galaxies and of the X-ray emitting gas to associate the optical substructures with the X-ray regions. When we apply this approach to Abell 85 (A85), a complex dynamical structure emerges from our analysis: a galaxy group, with redshift $z=0.0509 \pm 0.0021$, is passing through the cluster center along the line of sight, dragging part of the ICM present in the cluster core; two additional groups, at redshift $z=0.0547 \pm 0.0022$ and $z=0.0570 \pm 0.0020$, are going through the cluster in opposite directions, almost perpendicularly to the line of sight, and have substantially perturbed the dynamics of the ICM. An additional group in the outskirts of A85, at redshift $z=0.0561 \pm 0.0023$, is associated with a secondary peak of the X-ray emission, at redshift $z=0.0583^{+0.0039}_{-0.0047}$. Although our analysis and results on A85 need to be confirmed by high-resolution spectroscopy, they demonstrate how our new approach can be a powerful tool to constrain the formation history of galaxy clusters by unveiling their central and surrounding structures.

preprint, 2015, arXiv

A method to search for bulk motions in the ICM with Chandra CCD spectra: application to the Bullet cluster

We propose a strategy to search for bulk motions in the intracluster medium (ICM) of merging clusters based on Chandra CCD data. Our goal is to derive robust measurements of the average redshift of projected ICM regions obtained from the centroid of the $K_α$ line emission. We thoroughly explore the effect of the unknown temperature structure along the line of sight to accurately evaluate the systematic uncertainties on the ICM redshift. We apply our method to the "Bullet cluster" (1E 0657-56). We directly identify 23 independent regions on the basis of the surface brightness contours, and measure the redshift of the ICM averaged along the line of sight in each. We find that the redshift distribution across these regions is marginally inconsistent with the null hypothesis of a constant redshift or no bulk motion in the ICM, at a confidence level of about $2\, σ$. We tentatively identify the regions most likely affected by bulk motions and find a maximum velocity gradient of about $(46\pm 13)$ $\rm km~s^{-1}~kpc^{-1}$ along the line of sight on a scale of $\sim 260$ kpc along the path of the "bullet." We interpret this as the possible signature of a significant mass of ICM pushed away along a direction perpendicular to the merging. This preliminary result is promising for a systematic search for bulk motions in bright, moderate-redshift clusters based on spatially resolved spectral analysis of Chandra CCD data.
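A test of the constant-redshift null hypothesis can be sketched as a chi-square comparison against the inverse-variance weighted mean. This is a generic illustration that assumes independent regions with symmetric Gaussian errors; the abstract does not describe the authors' actual statistical treatment (which also accounts for systematic uncertainties), and the function name is hypothetical.

```python
def constant_redshift_chi2(redshifts, errors):
    """Chi-square of per-region redshifts against a single constant value.

    Returns (weighted_mean, chi2, degrees_of_freedom). A chi2 much larger
    than the degrees of freedom disfavors the no-bulk-motion hypothesis.
    """
    if len(redshifts) != len(errors) or len(redshifts) < 2:
        raise ValueError("need at least two regions with matching errors")
    weights = [1.0 / (e * e) for e in errors]
    # Inverse-variance weighted mean is the best-fit constant redshift.
    z_mean = sum(w * z for w, z in zip(weights, redshifts)) / sum(weights)
    chi2 = sum(((z - z_mean) / e) ** 2 for z, e in zip(redshifts, errors))
    dof = len(redshifts) - 1  # one fitted parameter: the mean redshift
    return z_mean, chi2, dof
```

Converting the chi-square value into a confidence level requires the chi-square distribution with the returned degrees of freedom, which is left out here to keep the sketch dependency-free.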

preprint, 2015, arXiv

Identification of galaxy cluster substructures with the Caustic method

We investigate the power of the caustic technique for identifying substructures of galaxy clusters from optical redshift data alone. The caustic technique is designed to estimate the mass profile of galaxy clusters to radii well beyond the virial radius, where dynamical equilibrium does not hold. Two by-products of this technique are the identification of the cluster members and the identification of the cluster substructures. We test the caustic technique as a substructure detector on two samples of 150 mock redshift surveys of clusters; the clusters are extracted from a large cosmological $N$-body simulation of a $Λ$CDM model and have masses of $M_{200} \sim 10^{14} h^{-1} M_{\odot}$ and $M_{200} \sim 10^{15} h^{-1} M_{\odot}$ in the two samples. We limit our analysis to substructures identified in the simulation with masses larger than $10^{13} h^{-1} M_{\odot}$. With mock redshift surveys with 200 galaxies within $3R_{200}$, (1) the caustic technique recovers $\sim 30-50$\% of the real substructures, and (2) $\sim 15-20$\% of the substructures identified by the caustic technique correspond to real substructures of the central cluster, the remaining fraction being low-mass substructures, groups or substructures of clusters in the surrounding region, or chance alignments of unrelated galaxies. These encouraging results show that the caustic technique is a promising approach for investigating the complex dynamics of galaxy clusters.

preprint, 2011, arXiv

Measuring redshift through X-ray spectroscopy of galaxy clusters: results from Chandra data and future prospects

The ubiquitous presence of the Fe line complex in the X-ray spectra of galaxy clusters offers the possibility of measuring their redshift without resorting to spectroscopic follow-up observations. In this paper we assess the accuracy with which the redshift of galaxy clusters can be recovered from an X-ray spectral analysis of Chandra archival data. This study indicates a strategy to build large surveys of clusters whose identification and redshift measurement are both based on X-ray data alone. We apply a blind search for the K-shell and L-shell Fe line complexes in X-ray cluster spectra using Chandra archival observations of galaxy clusters. The Fe line in the ICM spectra can be detected by simply analyzing the C-statistics variation $ΔC_{stat}$ as a function of the redshift parameter. We repeat the measurement under different conditions, and compare the X-ray derived redshift $z_X$ with the one obtained by means of optical spectroscopy $z_o$. We explore how a number of priors on metallicity and luminosity can be effectively used to reduce catastrophic errors. The $ΔC_{stat}$ provides the most efficient means for discarding wrong redshift measurements and estimating the actual error on $z_X$. We identify a simple and efficient procedure for optimally measuring the redshifts from the X-ray spectral analysis of clusters of galaxies. When this procedure is applied to mock catalogs extracted from high-sensitivity, wide-area cluster surveys, such as those proposed for the Wide Field X-ray Telescope (WFXT) mission, it is possible to obtain complete samples of X-ray clusters with reliable redshift measurements, thus avoiding time-consuming optical spectroscopic observations. This methodology will make it possible to trace cosmic growth by studying the evolution of the cluster mass function directly using X-ray data.
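The idea of reading the redshift off the variation of a fit statistic can be sketched as a grid scan. The code below is a toy illustration, not the authors' pipeline: `cstat` stands for any user-supplied function that returns the fit statistic at a trial redshift, and the approximate rest energy of 6.7 keV for the He-like Fe Kα complex is used only to show how the line shifts with redshift.

```python
FE_K_REST_KEV = 6.7  # approximate rest energy of the He-like Fe K-alpha complex


def observed_line_energy(z, rest_energy_kev=FE_K_REST_KEV):
    """Observed energy of an emission line emitted at redshift z."""
    return rest_energy_kev / (1.0 + z)


def scan_redshift(cstat, z_grid):
    """Evaluate a fit statistic on a redshift grid.

    Returns (best_z, delta), where delta[i] is the statistic at z_grid[i]
    minus its minimum over the grid, i.e. the delta-C curve.
    """
    values = [cstat(z) for z in z_grid]
    c_min = min(values)
    best_z = z_grid[values.index(c_min)]
    delta = [v - c_min for v in values]
    return best_z, delta
```

A deep, narrow minimum in the returned curve indicates a well-constrained redshift, while a shallow or multi-peaked curve flags a measurement that may need to be discarded.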

preprint, 2010, arXiv

Combining Optical and X-ray Observations of Galaxy Clusters to Constrain Cosmological Parameters

Galaxy clusters offer unique advantages for cosmology. Here we collect a new sample of 10 lensing galaxy clusters with X-ray observations to constrain cosmological parameters. The redshifts of the lensing clusters lie between 0.1 and 0.6, and the redshift range of their arcs is from 0.4 to 4.9. These clusters are selected carefully from strong gravitational lensing systems which have both X-ray satellite observations and optical giant luminous arcs with known redshift. Giant arcs usually appear in the central region of clusters, where mass can be traced with luminosity quite well. Based on gravitational lensing theory and a cluster mass distribution model, we can derive a Hubble-constant-independent ratio between two angular diameter distances. One is the distance to the lensing source and the other is that between the deflector and the source. Since angular diameter distance relies heavily on cosmological geometry, we can use these ratios to constrain cosmological models. Meanwhile, X-ray gas fractions of galaxy clusters can also be a cosmological probe. Because there are a dozen parameters to be fitted, we introduce a new analytic algorithm, Powell's UOBYQA (Unconstrained Optimization By Quadratic Approximation), to accelerate our calculation. Our result proves that this algorithm is an effective fitting method for such continuous multi-parameter constraints. We find that these two approaches are sensitive to $Ω_Λ$ and $Ω_{M}$ separately. Combining them, we obtain good fitted values of the basic cosmological parameters: $Ω_{M}=0.26_{-0.04}^{+0.04}$ and $Ω_Λ=0.82_{-0.16}^{+0.14}$.
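The Hubble-constant-independent distance ratio can be sketched numerically. The code below assumes a matter plus cosmological constant model with curvature and no radiation, and computes the ratio of the deflector-source to observer-source angular diameter distances; the function names and the Simpson integration are illustrative choices, not the authors' implementation.

```python
import math


def _inverse_e(z, omega_m, omega_l):
    """1 / E(z) for matter, curvature and a cosmological constant."""
    omega_k = 1.0 - omega_m - omega_l
    return 1.0 / math.sqrt(
        omega_m * (1.0 + z) ** 3 + omega_k * (1.0 + z) ** 2 + omega_l
    )


def comoving_chi(z, omega_m, omega_l, steps=1000):
    """Dimensionless line-of-sight comoving distance (units of c/H0).

    Composite Simpson rule; `steps` must be even.
    """
    if z == 0.0:
        return 0.0
    h = z / steps
    total = _inverse_e(0.0, omega_m, omega_l) + _inverse_e(z, omega_m, omega_l)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * _inverse_e(i * h, omega_m, omega_l)
    return total * h / 3.0


def _sinn(x, omega_k):
    """Curvature-dependent transverse distance function."""
    if omega_k > 1e-12:      # open universe
        s = math.sqrt(omega_k)
        return math.sinh(s * x) / s
    if omega_k < -1e-12:     # closed universe
        s = math.sqrt(-omega_k)
        return math.sin(s * x) / s
    return x                 # flat universe


def distance_ratio(z_d, z_s, omega_m, omega_l):
    """D_ds / D_s; the (1 + z_s) factors and c/H0 cancel in the ratio."""
    omega_k = 1.0 - omega_m - omega_l
    chi_d = comoving_chi(z_d, omega_m, omega_l)
    chi_s = comoving_chi(z_s, omega_m, omega_l)
    return _sinn(chi_s - chi_d, omega_k) / _sinn(chi_s, omega_k)
```

Because both distances scale with c/H0, the ratio depends only on the density parameters and the two redshifts, which is why it can constrain the cosmological geometry without a value for the Hubble constant.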
