Researcher profile

Zhiyuan Ma

Zhiyuan Ma contributes to research discovery and scholarly infrastructure.

Published work

13 published items

preprint, 2026, arXiv

PEARLS: 21 Transients Found in the Three-Epoch NIRCam Observations in the Continuous Viewing Zone of the James Webb Space Telescope

We present 21 transients from our three-epoch, four-band NIRCam observations covering 14.16 arcmin^2 in the Spitzer IRAC Dark Field (IDF), taken by the JWST Prime Extragalactic Areas for Reionization and Lensing Science program with a time cadence of ~6 months. A separate Hubble Space Telescope program provided Advanced Camera for Surveys optical imaging contemporaneous with the second and third epochs of the NIRCam observations. The NIRSpec spectroscopy on three transients confirmed a Type Ia supernova at z=1.63 and the host galaxies of the other two at z=2.64 and 1.90, respectively. Combining these with the photometric redshifts (z_ph) of the host galaxies in the rest of the sample, we find that the transients are in either a "mid-z" group at z>1.6 with M_V < -16.0 mag or a "low-z" group at z < 0.4 with M_H > -14.0 mag. The mid-z transients are consistent with supernovae. In contrast, the low-z transients' luminosities fall in the range of the so-called "gap transients" between supernovae and novae. However, this latter conclusion is only tentative due to possible catastrophic failures in z_ph that could bias them to low-z. Conversely, if they are indeed at z < 0.4, it would be worth studying similar transients in the future. Our work further demonstrates the power of NIRCam in transient science and also shows that it would be more fruitful to carry out a long-term monitoring program with more passbands, a higher cadence and prompt follow-up spectroscopy. Being in the continuous viewing zone of the JWST, the IDF is an ideal field for this purpose.

preprint, 2026, arXiv

PRISMA: Reinforcement Learning Guided Two-Stage Policy Optimization in Multi-Agent Architecture for Open-Domain Multi-Hop Question Answering

Answering real-world open-domain multi-hop questions over massive corpora is a critical challenge in Retrieval-Augmented Generation (RAG) systems. Recent research employs reinforcement learning (RL) to end-to-end optimize the retrieval-augmented reasoning process, directly enhancing its capacity to resolve complex queries. However, reliable deployment is hindered by two obstacles. 1) Retrieval Collapse: iterative retrieval over large corpora fails to locate intermediate evidence containing bridge answers without reasoning-guided planning, causing downstream reasoning to collapse. 2) Learning Instability: end-to-end trajectory training suffers from weak credit assignment across reasoning chains and poor error localization across modules, causing overfitting to benchmark-specific heuristics that limit transferability and stability. To address these problems, we propose PRISMA, a decoupled RL-guided framework featuring a Plan-Retrieve-Inspect-Solve-Memoize architecture. PRISMA's strength lies in reasoning-guided collaboration: the Inspector provides reasoning-based feedback to refine the Planner's decomposition and fine-grained retrieval, while enforcing evidence-grounded reasoning in the Solver. We optimize individual agent capabilities via Two-Stage Group Relative Policy Optimization (GRPO). Stage I calibrates the Planner and Solver as specialized experts in planning and reasoning, while Stage II utilizes Observation-Aware Residual Policy Optimization (OARPO) to enhance the Inspector's ability to verify context and trigger targeted recovery. Experiments show that PRISMA achieves state-of-the-art performance on ten benchmarks and can be deployed efficiently in real-world scenarios.

preprint, 2026, arXiv

Qwen-Image-2.0 Technical Report

We present Qwen-Image-2.0, an omni-capable image generation foundation model that unifies high-fidelity generation and precise image editing within a single framework. Despite recent progress, existing models still struggle with ultra-long text rendering, multilingual typography, high-resolution photorealism, robust instruction following, and efficient deployment, especially in text-rich and compositionally complex scenarios. Qwen-Image-2.0 addresses these challenges by coupling Qwen3-VL as the condition encoder with a Multimodal Diffusion Transformer for joint condition-target modeling, supported by large-scale data curation and a customized multi-stage training pipeline. This enables strong multimodal understanding while preserving flexible generation and editing capabilities. The model supports instructions of up to 1K tokens for generating text-rich content such as slides, posters, infographics, and comics, while significantly improving multilingual text fidelity and typography. It also enhances photorealistic generation with richer details, more realistic textures, and coherent lighting, and follows complex prompts more reliably across diverse styles. Extensive human evaluations show that Qwen-Image-2.0 substantially outperforms previous Qwen-Image models in both generation and editing, marking a step toward more general, reliable, and practical image generation foundation models.

preprint, 2026, arXiv

TMPO: Trajectory Matching Policy Optimization for Diverse and Efficient Diffusion Alignment

Reinforcement learning (RL) has shown extraordinary potential in aligning diffusion models to downstream tasks, yet most RL-based alignment methods still suffer from significant reward hacking, which degrades generative diversity and quality by inducing visual mode collapse and amplifying unreliable rewards. We identify the root cause as the mode-seeking nature of these methods, which maximize expected reward without effectively constraining the probability distribution over acceptable trajectories, causing concentration on a few high-reward paths. In contrast, we propose Trajectory Matching Policy Optimization (TMPO), which replaces scalar reward maximization with trajectory-level reward distribution matching. Specifically, TMPO introduces a Softmax Trajectory Balance (Softmax-TB) objective to match the policy probabilities of K trajectories to a reward-induced Boltzmann distribution. We prove that this objective inherits the mode-covering property of forward KL divergence, preserving coverage over all acceptable trajectories while optimizing reward. To further reduce multi-trajectory training time on large-scale flow-matching models, TMPO incorporates Dynamic Stochastic Tree Sampling, where trajectories share denoising prefixes and branch at dynamically scheduled steps, reducing redundant computation while improving training effectiveness. Extensive results across diverse alignment tasks such as human preference, compositional generation and text rendering show that TMPO improves generative diversity over state-of-the-art methods by 9.1%, and achieves competitive performance in all downstream and efficiency metrics, attaining the optimal trade-off between reward and diversity.
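
The mode-covering behaviour of forward KL divergence mentioned in this abstract can be illustrated with a small sketch. This is not the paper's Softmax-TB objective, whose exact form the abstract does not give. It only builds a Boltzmann target from rewards and compares it, via forward KL, with two hypothetical policies over K = 4 trajectories. The function names, the temperature, and all numbers are assumptions made for illustration.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def boltzmann_target(rewards, temperature=1.0):
    """Reward-induced Boltzmann distribution over K sampled trajectories."""
    return softmax([r / temperature for r in rewards])

def forward_kl(target, policy):
    """KL(target || policy); grows when the policy starves a rewarded trajectory."""
    return sum(t * math.log(t / p) for t, p in zip(target, policy) if t > 0.0)

# Hypothetical rewards for K = 4 trajectories (illustrative values only).
rewards = [1.0, 0.9, 0.8, 0.1]
target = boltzmann_target(rewards, temperature=0.5)

# A policy concentrated on one trajectory versus one spread like the target.
collapsed = softmax([5.0, -5.0, -5.0, -5.0])
covering = softmax([2.0, 1.8, 1.6, 0.2])

print("forward KL, collapsed policy:", forward_kl(target, collapsed))
print("forward KL, covering policy: ", forward_kl(target, covering))
```

The collapsed policy is penalised heavily because it assigns almost no probability to trajectories that the target still rewards, while the covering policy matches the target and incurs essentially no penalty.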

preprint, 2022, arXiv

Counterparts of Candidate Dusty Starbursts at z > 6

We present an analysis of the optical-to-near-IR counterparts of a sample of candidate dusty starbursts at z > 6. These objects were pre-selected based on the rising trend of their far-infrared-to-sub-millimeter spectral energy distributions and the fact that they are radio-weak. Their precise positions are available through millimeter and/or radio interferometry, which enables us to search for their counterparts in the deep optical-to-near-IR images. The sample includes five z > 6 candidates. Three of them have their counterparts identified, one is still invisible in the deepest images, and one is a known galaxy at z = 5.667 that is completely blocked by a foreground galaxy. The three with counterparts identified are analyzed using population synthesis models, and they have photometric redshift solutions ranging from 7.5 to 9.0. Assuming that they are indeed at these redshifts and that they are not gravitationally lensed, their total IR luminosities are 10^{13.8-14.1} L_sun and the inferred star formation rates are 6.3--13 x 10^3 M_sun/yr. The existence of dusty starbursts at such redshifts would imply that the universe must be forming stars intensely very early in time in at least some galaxies, otherwise there would not be enough dust to produce the descendants observed at these redshifts. The inferred host galaxy stellar masses of these three objects, which are at >~ 10^{11} M_sun (if not affected by gravitational lensing), present a difficulty in explanation unless we are willing to accept that their progenitors either kept forming stars at a rate of >~ 10^3 M_sun/yr or were formed through intense instantaneous bursts. Spectroscopic confirmation of such objects will be imperative.

preprint, 2022, arXiv

First Batch of Candidate Galaxies at Redshifts 11 to 20 Revealed by the James Webb Space Telescope Early Release Observations

On July 13, 2022, NASA released to the whole world the data obtained by the James Webb Space Telescope (JWST) Early Release Observations (ERO). This is the first set of science-grade data from this long-awaited facility, marking the beginning of a new era in astronomy. In the study of the early universe, JWST will allow us to push far beyond z ~ 11, the redshift boundary previously imposed by the 1.7 um red cut-off of the Hubble Space Telescope (HST). In contrast, JWST's NIRCam reaches 5 um. Among the JWST ERO targets there is a nearby galaxy cluster SMACS 0723-73, which is a massive cluster and has been long recognized as a potential "cosmic telescope" in amplifying background galaxies. The ERO six-band NIRCam observations on this target have covered an additional flanking field not boosted by gravitational lensing, which also sees far beyond HST. Here we report the result from our search of candidate objects at z > 11 using these ERO data. In total, there are 87 such objects identified by using the standard "dropout" technique. These objects are all detected in multiple bands and therefore cannot be spurious. For most of them, their multi-band colors are inconsistent with known types of contaminants. If the detected dropout signature is interpreted as the expected Lyman-break, it implies that these objects are at z ~ 11--20. The large number of such candidate objects at such high redshifts is not expected from the previously favored predictions and demands further investigations. JWST spectroscopy on such objects will be critical.

preprint, 2022, arXiv

JWST's PEARLS: A JWST/NIRCam view of ALMA sources

We report the results of James Webb Space Telescope/NIRCam observations of 19 (sub)millimeter (submm/mm) sources detected by the Atacama Large Millimeter Array (ALMA). The accurate ALMA positions allowed unambiguous identifications of their NIRCam counterparts. Taking gravitational lensing into account, these represent 16 distinct galaxies in three fields and constitute the largest sample of its kind to date. The counterparts' spectral energy distributions from rest-frame ultraviolet to near infrared provide photometric redshifts ($1<z<4.5$) and stellar masses ($M_*>10^{10.5}$ Msol), which are similar to sub-millimeter galaxy (SMG) hosts studied previously. However, our sample is fainter in submm/mm than the classic SMG samples are, and our sources exhibit a wider range of properties. They have dust-embedded star-formation rates as low as 10 Msol yr$^{-1}$, and the sources populate both the star-forming main sequence and the quiescent categories. The deep NIRCam data allow us to study the rest-frame near-IR morphologies. Excluding two multiply imaged systems and one quasar, the majority of the remaining sources are disk-like and show either little or no disturbance. This suggests that secular growth is a potential route for the assembly of high-mass disk galaxies. While a few hosts have large disks, the majority have small disks (median half-mass radius of 1.6 kpc). At this time, it is unclear whether this is due to the prevalence of small disks at these redshifts or some unknown selection effects of deep ALMA observations. A larger sample of ALMA sources with NIRCam observations will be able to address this question.

preprint, 2022, arXiv

The Astropy Project: Sustaining and Growing a Community-oriented Open-source Project and the Latest Major Release (v5.0) of the Core Package

The Astropy Project supports and fosters the development of open-source and openly-developed Python packages that provide commonly needed functionality to the astronomical community. A key element of the Astropy Project is the core package $\texttt{astropy}$, which serves as the foundation for more specialized projects and packages. In this article, we summarize key features in the core package as of the recent major release, version 5.0, and provide major updates for the Project. We then discuss supporting a broader ecosystem of interoperable packages, including connections with several astronomical observatories and missions. We also revisit the future outlook of the Astropy Project and the current status of Learn Astropy. We conclude by raising and discussing the current and future challenges facing the Project.

preprint, 2022, arXiv

Webb's PEARLS: Bright 1.5--2.0 micron Dropouts in the Spitzer/IRAC Dark Field

Using the first epoch of four-band NIRCam observations obtained by the James Webb Space Telescope (JWST) Prime Extragalactic Areas for Reionization and Lensing Science Program in the Spitzer IRAC Dark Field, we search for F150W and F200W dropouts. In 14.2 arcmin^2, we have found eight F150W dropouts and eight F200W dropouts, all brighter than 27.5 mag (the brightest being ~24 mag) in the band to the red side of the break. As they are detected in multiple bands, these must be real objects. Their nature, however, is unclear, and characterizing their properties is important for realizing the full potential of JWST. If the observed color decrements are due to the Lyman break, these objects should be at z >~ 11.7 and z >~ 15.4, respectively. The color diagnostics show that at least four F150W dropouts are far away from the usual contaminators encountered in dropout searches (red galaxies at much lower redshifts or brown dwarf stars). While the diagnostics of the F200W dropouts are less certain due to the limited number of passbands, at least one of them is likely not a known type of contaminant, and the rest are consistent with either high-redshift galaxies with evolved stellar populations or old galaxies at z ~ 3 to 8. If a significant fraction of our dropouts are indeed at z ~ 12, we have to face the severe problem of explaining their high luminosities and number densities. Spectroscopic identifications of such objects are urgently needed.
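
The redshift limits quoted in this abstract follow from where the Lyman break lands in the observed frame. The sketch below assumes the break sits at the rest-frame Lyman-alpha wavelength of about 0.1216 micron, which is the usual approximation at these redshifts, and the function names are hypothetical. It simply converts between redshift and the observed break wavelength.

```python
# Rest-frame Lyman-alpha wavelength in microns (assumed location of the break).
LYMAN_ALPHA_UM = 0.1216

def observed_break_um(z):
    """Observed wavelength (microns) of the Lyman break for a source at redshift z."""
    return LYMAN_ALPHA_UM * (1.0 + z)

def redshift_for_break(observed_um):
    """Redshift at which the Lyman break is observed at the given wavelength."""
    return observed_um / LYMAN_ALPHA_UM - 1.0

for z in (11.7, 15.4):
    print(f"z = {z}: break observed near {observed_break_um(z):.2f} micron")
```

Under this assumption the break falls near 1.54 micron at z = 11.7 and near 1.99 micron at z = 15.4, which is consistent with the 1.5--2.0 micron dropouts named in the title.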

preprint, 2020, arXiv

A Systematic Search for Reddest Far-infrared and Sub-millimeter Galaxies: revealing dust-embedded starbursts at high redshifts

We present the results of our systematic search for the reddest far-infrared (FIR) and submillimeter (sub-mm) galaxies using the data from the Herschel Multi-tiered Extragalactic Survey (HerMES) and the SCUBA2 Cosmological Legacy Survey (S2CLS). The red FIR galaxies are "500 $\mu$m risers," whose spectral energy distributions increase with wavelength across the three FIR passbands of the Spectral and Photometric Imaging REceiver (SPIRE) of Herschel. Within 106.5 deg$^2$ of the HerMES fields, we have selected 629 such risers. The red sub-mm galaxies are "SPIRE dropouts," which are prominent detections in the S2CLS 850 $\mu$m data but are extremely weak or invisible in the SPIRE bands. Within the 2.98 deg$^2$ common area of HerMES and S2CLS, we have selected 95 such objects. These very red sources could be dusty starbursts at high redshifts ($z\gtrsim 4$-6) because the peak of their cold-dust emission heated by star formation is shifted to the reddest FIR/sub-mm bands. The surface density of 500 $\mu$m risers is $\sim$8.2 deg$^{-2}$ at the $\geq 20$ mJy level in 500 $\mu$m, while that of SPIRE dropouts is $\sim$19.3 deg$^{-2}$ at the $\geq 5$ mJy level in 850 $\mu$m. Both types of objects could span a wide range of redshifts, however. Using deep radio data in these fields to further select the ones likely at the highest redshifts, we find that the surface density of $z>6$ candidates is 5.5 deg$^{-2}$ among 500 $\mu$m risers and is 0.8-13.6 deg$^{-2}$ among SPIRE dropouts. If this is correct, the dust-embedded star formation processes in such objects could make a contribution to the global SFR density at $z>6$ comparable to that of Lyman-break galaxies.

preprint, 2020, arXiv

Distributed Generalized Nash Equilibrium Seeking for Energy Sharing Games

With the proliferation of distributed generators and energy storage systems, traditional passive consumers in power systems have been gradually evolving into the so-called "prosumers", i.e., proactive consumers, which can both produce and consume power. To encourage energy exchange among prosumers, energy sharing is increasingly adopted, which is usually formulated as a generalized Nash game (GNG). In this paper, a distributed approach is proposed to seek the generalized Nash equilibrium (GNE) of the energy sharing game. To this end, we convert the GNG into an equivalent optimization problem. A Krasnosel'skiĭ-Mann iteration type algorithm is thereby devised to solve the problem and consequently find the GNE in a distributed manner. The convergence of the proposed algorithm is proved rigorously based on the nonexpansive operator theory. The performance of the algorithm is validated by experiments with three prosumers, and the scalability is tested by simulations using 123 prosumers.
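
For readers unfamiliar with the Krasnosel'skiĭ-Mann iteration named in this abstract, a minimal generic sketch follows. It is not the paper's distributed algorithm or its energy sharing operator. It shows the basic averaged update x_{k+1} = (1 - alpha) x_k + alpha T(x_k) for a nonexpansive operator T, using a 90-degree rotation in the plane as a toy operator. The names `km_iterate` and `rotate90`, the step size, and the tolerance are assumptions for illustration.

```python
def km_iterate(T, x0, alpha=0.5, tol=1e-10, max_iter=1000):
    """Krasnosel'skii-Mann iteration: x <- (1 - alpha) * x + alpha * T(x)."""
    x = list(x0)
    for _ in range(max_iter):
        tx = T(x)
        x_next = [(1.0 - alpha) * xi + alpha * ti for xi, ti in zip(x, tx)]
        # Stop once successive iterates are closer than the tolerance.
        residual = max(abs(a - b) for a, b in zip(x_next, x))
        x = x_next
        if residual < tol:
            break
    return x

def rotate90(v):
    """Toy nonexpansive operator: rotate a 2-D vector by 90 degrees.
    Its only fixed point is the origin."""
    return [-v[1], v[0]]

fixed_point = km_iterate(rotate90, [1.0, 0.0])
print(fixed_point)  # both components end up very close to zero
```

Applying `rotate90` repeatedly without averaging only cycles around the unit circle, whereas the averaged update contracts toward the fixed point. That contrast is the reason the averaging step matters for operators that are nonexpansive but not contractive.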

preprint, 2020, arXiv

NE-LP: Normalized Entropy and Loss Prediction based Sampling for Active Learning in Chinese Word Segmentation on EHRs

Electronic Health Records (EHRs) in hospital information systems contain patients' diagnoses and treatments, so EHRs are essential to clinical data mining. Of all the tasks in the mining process, Chinese Word Segmentation (CWS) is a fundamental and important one, and most state-of-the-art methods rely heavily on large-scale manually annotated data. Since annotation is time-consuming and expensive, efforts have been devoted to techniques, such as active learning, to locate the most informative samples for modeling. In this paper, we follow the trend and present an active learning method for CWS in EHRs. Specifically, a new sampling strategy combining Normalized Entropy with Loss Prediction (NE-LP) is proposed to select the most representative data. Meanwhile, to minimize the computational cost of learning, we propose a joint model including a word segmenter and a loss prediction model. Furthermore, to capture interactions between adjacent characters, bigram features are also applied in the joint model. To illustrate the effectiveness of NE-LP, we conducted experiments on EHRs collected from the Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine. The results demonstrate that NE-LP consistently outperforms conventional uncertainty-based sampling strategies for active learning in CWS.
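
The entropy half of such a sampling strategy can be sketched as follows. The abstract does not define its Normalized Entropy score or how it is combined with loss prediction, so this is only one plausible reading: per-character label entropy divided by its maximum, averaged over the sentence, and used to rank an unlabeled pool. All function names, the four-tag label set, and the probabilities are hypothetical.

```python
import math

def normalized_entropy(probs):
    """Shannon entropy of one label distribution, scaled to [0, 1] by log(num_labels)."""
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))

def sentence_score(char_distributions):
    """Average normalized entropy over the characters of one sentence."""
    return sum(normalized_entropy(d) for d in char_distributions) / len(char_distributions)

def select_most_uncertain(pool, k):
    """Return the ids of the k sentences with the highest uncertainty score."""
    ranked = sorted(pool, key=lambda item: sentence_score(item[1]), reverse=True)
    return [sentence_id for sentence_id, _ in ranked[:k]]

# Hypothetical pool: (sentence id, per-character probabilities over 4 segmentation tags).
pool = [
    ("s1", [[0.97, 0.01, 0.01, 0.01], [0.90, 0.05, 0.03, 0.02]]),
    ("s2", [[0.25, 0.25, 0.25, 0.25], [0.40, 0.30, 0.20, 0.10]]),
    ("s3", [[0.70, 0.10, 0.10, 0.10], [0.60, 0.20, 0.10, 0.10]]),
]
print(select_most_uncertain(pool, 2))
```

Dividing by log(num_labels) keeps the score between 0 and 1, and averaging over characters avoids favouring long sentences simply because they contain more characters.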

preprint, 2020, arXiv

Space Telescope and Optical Reverberation Mapping Project. IX. Velocity-Delay Maps for Broad Emission Lines in NGC 5548

We report velocity-delay maps for prominent broad emission lines, Ly_alpha, CIV, HeII and H_beta, in the spectrum of NGC5548. The emission-line responses inhabit the interior of a virial envelope. The velocity-delay maps reveal stratified ionization structure. The HeII response inside 5-10 light-days has a broad single-peaked velocity profile. The Ly_alpha, CIV, and H_beta responses peak inside 10 light-days, extend outside 20 light-days, and exhibit a velocity profile with two peaks separated by 5000 km/s in the 10 to 20 light-day delay range. The velocity-delay maps show that the M-shaped lag vs velocity structure found in previous cross-correlation analysis is the signature of a Keplerian disk with a well-defined outer edge at R=20 light-days. The outer wings of the M arise from the virial envelope, and the U-shaped interior of the M is the lower half of an ellipse in the velocity-delay plane. The far-side response is weaker than that from the near side, so that we see clearly the lower half, but only faintly the upper half, of the velocity--delay ellipse. The delay tau=(R/c)(1-sin(i))=5 light-days at line center is from the near edge of the inclined ring, giving the inclination i=45 deg. A black hole mass of M=7x10^7 Msun is consistent with the velocity-delay structure. A barber-pole pattern with stripes moving from red to blue across the CIV and possibly Ly_alpha line profiles suggests the presence of azimuthal structure rotating around the far side of the broad-line region and may be the signature of precession or orbital motion of structures in the inner disk. Further HST observations of NGC 5548 over a multi-year timespan but with a cadence of perhaps 10 days rather than 1 day could help to clarify the nature of this new AGN phenomenon.
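
The delay relation quoted in this abstract, tau = (R/c)(1 - sin(i)), is easy to evaluate numerically. In the sketch below the radius is expressed in light-days, so R/c is simply a number of days, and the function names are hypothetical. Plugging in the abstract's rounded values gives results close to, but not exactly equal to, the quoted ones, which suggests the published numbers are rounded.

```python
import math

def near_edge_delay(radius_light_days, inclination_deg):
    """Delay in days from the near edge of an inclined ring: (R/c) * (1 - sin i)."""
    return radius_light_days * (1.0 - math.sin(math.radians(inclination_deg)))

def inclination_from_delay(radius_light_days, delay_days):
    """Invert the same relation to recover the inclination in degrees."""
    return math.degrees(math.asin(1.0 - delay_days / radius_light_days))

# Values taken from the abstract: R = 20 light-days, tau = 5 light-days, i = 45 deg.
print(f"delay for i = 45 deg: {near_edge_delay(20.0, 45.0):.2f} days")
print(f"inclination for tau = 5 days: {inclination_from_delay(20.0, 5.0):.1f} deg")
```

With R = 20 light-days, an inclination of 45 deg gives a delay of about 5.9 days, and a delay of 5 days corresponds to an inclination of about 48.6 deg.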