Researcher profile

Yujie Zhang

Yujie Zhang contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
13 topics
4 close collaborators

Published work

8 published items

preprint, 2025, arXiv

RSAgent: Learning to Reason and Act for Text-Guided Segmentation via Multi-Turn Tool Invocations

Text-guided object segmentation requires both cross-modal reasoning and pixel grounding abilities. Most recent methods treat text-guided segmentation as one-shot grounding, where the model predicts pixel prompts in a single forward pass to drive an external segmentor, which limits verification, refocusing and refinement when initial localization is wrong. To address this limitation, we propose RSAgent, an agentic Multimodal Large Language Model (MLLM) which interleaves reasoning and action for segmentation via multi-turn tool invocations. RSAgent queries a segmentation toolbox, observes visual feedback, and revises its spatial hypothesis using historical observations to re-localize targets and iteratively refine masks. We further build a data pipeline to synthesize multi-turn reasoning segmentation trajectories, and train RSAgent with a two-stage framework: cold-start supervised fine-tuning followed by agentic reinforcement learning with fine-grained, task-specific rewards. Extensive experiments show that RSAgent achieves a zero-shot performance of 66.5% gIoU on ReasonSeg test, improving over Seg-Zero-7B by 9%, and reaches 81.5% cIoU on RefCOCOg, demonstrating state-of-the-art performance on both in-domain and out-of-domain benchmarks.
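The abstract reports both gIoU and cIoU without defining them. The sketch below assumes the convention common in reasoning-segmentation benchmarks: gIoU is the mean of per-image IoU values, and cIoU is the cumulative intersection divided by the cumulative union over the whole set. The function names and the handling of empty masks are illustrative assumptions, not taken from the RSAgent paper.

```python
from typing import Sequence, Tuple

# A binary mask as rows of 0/1 values (hypothetical representation for illustration).
Mask = Sequence[Sequence[int]]


def intersection_and_union(pred: Mask, target: Mask) -> Tuple[int, int]:
    """Count intersecting and united pixels of two equally sized binary masks."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    return inter, union


def giou_and_ciou(preds: Sequence[Mask], targets: Sequence[Mask]) -> Tuple[float, float]:
    """Return (gIoU, cIoU) under the assumed definitions.

    gIoU: average of per-image IoU.
    cIoU: sum of intersections divided by sum of unions.
    Assumption: an image with an empty union counts as IoU 1.0.
    """
    per_image = []
    total_inter = 0
    total_union = 0
    for pred, target in zip(preds, targets):
        inter, union = intersection_and_union(pred, target)
        per_image.append(inter / union if union > 0 else 1.0)
        total_inter += inter
        total_union += union
    giou = sum(per_image) / len(per_image)
    ciou = total_inter / total_union if total_union > 0 else 1.0
    return giou, ciou


if __name__ == "__main__":
    preds = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
    targets = [[[1, 0], [0, 0]], [[1, 0], [0, 0]]]
    print(giou_and_ciou(preds, targets))
```

Because cIoU pools pixels across images, it weights large objects more heavily than gIoU does, which is one reason the two figures are usually reported separately.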

preprint, 2022, arXiv

Building Multiple Access Channels with a Single Particle

A multiple access channel describes a situation in which multiple senders are trying to forward messages to a single receiver using some physical medium. In this paper we consider scenarios in which this medium consists of just a single classical or quantum particle. In the quantum case, the particle can be prepared in a superposition state thereby allowing for a richer family of encoding strategies. To make the comparison between quantum and classical channels precise, we introduce an operational framework in which all possible encoding strategies consume no more than a single particle. We apply this framework to an N-port interferometer experiment in which each party controls a path the particle can traverse. When used for the purpose of communication, this setup embodies a multiple access channel (MAC) built with a single particle. We provide a full characterization of the N-party classical MACs that can be built from a single particle, and we show that every non-classical particle can generate a MAC outside the classical set. To further distinguish the capabilities of a single classical and quantum particle, we relax the locality constraint and allow for joint encodings by subsets of 1 < K <= N parties. This generates a richer family of classical MACs whose polytope dimension we compute. We identify a "generalized fingerprinting inequality" as a valid facet for this polytope, and we verify that a quantum particle distributed among N separated parties can violate this inequality even when K = N-1. Connections are drawn between the single-particle framework and multi-level coherence theory. We show that every pure state with K-level coherence can be detected in a semi-device independent manner, with the only assumption being conservation of particle number.

preprint, 2022, arXiv

On-the-fly 3D metrology of volumetric additive manufacturing

Additive manufacturing techniques are revolutionizing product development by enabling fast turnaround from design to fabrication. However, the throughput of the rapid prototyping pipeline remains constrained by print optimization, requiring multiple iterations of fabrication and ex-situ metrology. Despite the need for a suitable technology, robust in-situ shape measurement of an entire print is not currently available with any additive manufacturing modality. Here, we address this shortcoming by demonstrating fully simultaneous 3D metrology and printing. We exploit the dramatic increase in light scattering by a photoresin during gelation for real-time 3D imaging of prints during tomographic volumetric additive manufacturing. Tomographic imaging of the light scattering density in the build volume yields quantitative, artifact-free 3D + time models of cured objects that are accurate to below 1% of the size of the print. By integrating shape measurement into the printing process, our work paves the way for next-generation rapid prototyping with real-time defect detection and correction.

preprint, 2022, arXiv

Using Loaded N-port Structures to Achieve the Continuous-Space Electromagnetic Channel Capacity Bound

A method for achieving the continuous-space electromagnetic channel capacity bound using loaded N-port structures is described. It is relevant for the design of compact multiple-input multiple-output (MIMO) antennas that can achieve channel capacity bounds when constrained by size. The method is not restricted to a specific antenna configuration, and a closed-form expression for the channel capacity limits is provided under various constraints. Furthermore, using loaded N-port structures to represent arbitrary antenna geometries, an efficient optimization approach is proposed for finding the optimum MIMO antenna design that achieves the channel capacity bounds. Simulation results of the channel capacity bounds achieved using our MIMO antenna design with one square wavelength size are provided. These show that at least 18 ports can be supported in one square wavelength and achieve the continuous-space electromagnetic channel capacity bound. The results demonstrate that our method can link continuous-space electromagnetic channel capacity bounds to MIMO antenna design.

preprint, 2020, arXiv

Bi2Te3/Si thermophotovoltaic cells converting low temperature radiation into electricity

Thermophotovoltaic cells that convert low-temperature radiation into electricity are of significance due to their potential applications in many fields. In this work, Bi2Te3/Si thermophotovoltaic cells that operate under radiation from a blackbody at temperatures of 300 K-480 K are presented. The experimental results show that the cells can output electricity even at a radiation temperature of 300 K. The band structure of Bi2Te3/Si heterojunctions and the defects in Bi2Te3 thin films lower the conversion efficiency of the cells. It is also demonstrated that the resistivity of Si and the thickness of Bi2Te3 thin films have important effects on Bi2Te3/Si thermophotovoltaic cells. Although the cells' output power is small, this work provides a possible way to utilize low-temperature radiation.

preprint, 2020, arXiv

Robust Medical Instrument Segmentation Challenge 2019

Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions. While numerous methods for detecting, segmenting and tracking of medical instruments based on endoscopic video images have been proposed in the literature, key limitations remain to be addressed: Firstly, robustness, that is, the reliable performance of state-of-the-art methods when run on challenging images (e.g. in the presence of blood, smoke or motion artifacts). Secondly, generalization; algorithms trained for a specific intervention in a specific hospital should generalize to other interventions or institutions. In an effort to promote solutions for these limitations, we organized the Robust Medical Instrument Segmentation (ROBUST-MIS) challenge as an international benchmarking competition with a specific focus on the robustness and generalization capabilities of algorithms. For the first time in the field of endoscopic image processing, our challenge included a task on binary segmentation and also addressed multi-instance detection and segmentation. The challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures from three different types of surgery. The validation of the competing methods for the three tasks (binary segmentation, multi-instance detection and multi-instance segmentation) was performed in three different stages with an increasing domain gap between the training and the test data. The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap. While the average detection and segmentation quality of the best-performing algorithms is high, future research should concentrate on detection and segmentation of small, crossing, moving and transparent instrument(s) (parts).

preprint, 2019, arXiv

Bi2Te3/Sb2Te3 Heterojunction and Thermophotovoltaic Cells Absorbing the Radiation from Room-temperature Surroundings

Thermophotovoltaic cells that can convert the infrared radiation from room-temperature surroundings into electricity are of significance due to their potential applications in many fields. In this work, narrow bandgap Bi2Te3/Sb2Te3 thin film thermophotovoltaic cells were fabricated, and the formation mechanism of Bi2Te3/Sb2Te3 p-n heterojunctions was investigated. During the formation of the heterojunctions at room temperature, both electrons and holes diffuse in the same direction, from n-type Bi2Te3 thin films to p-type Sb2Te3 thin films, rather than undergoing conventional bi-directional diffusion. Because the strong intrinsic excitation generates a large number of intrinsic carriers, which weaken the built-in electric field of the heterojunctions, their I-V curves become similar to straight lines. It is also demonstrated that Bi2Te3/Sb2Te3 thermophotovoltaic cells can output electrical power under the infrared radiation from a room-temperature heat source. This work proves that it is possible to convert the infrared radiation from dark and room-temperature surroundings into electricity through narrow bandgap thermophotovoltaic cells.

preprint, 2019, arXiv

Channel Activation of CHSH Nonlocality

Quantum channels that break CHSH nonlocality on all input states are known as CHSH-breaking channels. In quantum networks, such channels are useless for distributing correlations that can violate the CHSH inequality. Motivated by previous work on activation of nonlocality in quantum states, here we demonstrate an analogous activation of CHSH-breaking channels. That is, we show that certain pairs of CHSH-breaking channels are no longer CHSH-breaking when used in combination. We find that this type of activation can emerge in both uni-directional and bi-directional communication scenarios.
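For readers unfamiliar with the inequality named in this abstract, its standard textbook form is sketched below. The notation is an assumption chosen for illustration and is not taken from the paper: a, a' and b, b' denote the two measurement settings of each party, and E(x, y) is the expected product of their ±1-valued outcomes.

```latex
% CHSH quantity for settings (a, a') and (b, b')
S = E(a, b) + E(a, b') + E(a', b) - E(a', b')

% Bound satisfied by every local hidden-variable model
|S| \le 2

% Maximum attainable with quantum correlations (Tsirelson bound)
|S| \le 2\sqrt{2}
```

In these terms, a channel would be CHSH-breaking if the states it outputs can never yield |S| > 2, whatever the input state and measurement settings.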