Researcher profile

Zhen Shen

Zhen Shen contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
5 topics
4 close collaborators

Published work

4 published items

Preprint, 2026, arXiv

HiDream-O1-Image: A Natively Unified Image Generative Foundation Model with Pixel-level Unified Transformer

The evolution of visual generative models has long been constrained by fragmented architectures relying on disjoint text encoders and external VAEs. In this report, we present HiDream-O1-Image, a natively unified generative foundation model built on a pixel-space Diffusion Transformer, that pioneers a paradigm shift from modular architectures to an end-to-end in-context visual generation engine. By mapping raw image pixels, text tokens, and task-specific conditions into a single shared token space, HiDream-O1-Image achieves a structural unification of multimodal inputs within a Unified Transformer (UiT) architecture. This native encoding paradigm eliminates the need for separate VAEs or disjoint pre-trained text encoders, allowing the model to treat diverse generation and editing tasks as a consistent in-context reasoning process. Extensive experiments show that HiDream-O1-Image excels across various generation tasks, including text-to-image generation, instruction-based editing, and subject-driven personalization. Notably, with only 8B parameters, HiDream-O1-Image (8B) matches or even surpasses established state-of-the-art models with significantly more parameters (e.g., 27B Qwen-Image). Crucially, to validate the scalability of this paradigm, we successfully scale the architecture up to over 200B parameters. Experimental results demonstrate that this massive-scale version, HiDream-O1-Image-Pro (200B+), unlocks unprecedented generative capabilities and superior performance, establishing new state-of-the-art benchmarks. Ultimately, HiDream-O1-Image highlights the immense potential of natively unified architectures and charts a highly scalable path toward next-generation multimodal AI.

Preprint, 2022, arXiv

Non-reciprocal frequency conversion and mode routing in a microresonator

The transport of photons and phonons typically obeys the principle of reciprocity. Breaking the reciprocity of these bosonic excitations will enable the corresponding non-reciprocal devices, such as isolators and circulators. Here, we use two optical modes and two mechanical modes in a microresonator to form a four-mode plaquette via radiation pressure force. Phase-controlled non-reciprocal routing between any two modes with completely different frequencies is demonstrated, including the routing of phonon to phonon (MHz to MHz), photon to phonon (THz to MHz), and especially, for the first time, photon to photon with a frequency difference of around 80 THz. In addition, one more mechanical mode is introduced to this plaquette to realize a phononic circulator in such a single microresonator. The non-reciprocity is derived from interference between multi-mode transfer processes involving optomechanical interactions in an optomechanical resonator. It not only demonstrates the non-reciprocal routing of photons and phonons in a single resonator but also realizes non-reciprocal frequency conversion for photons and circulation for phonons, laying a foundation for studying directional routing and thermal management in an optomechanical hybrid network.

Preprint, 2019, arXiv

Enhanced optomechanical entanglement and cooling via dissipation engineering

We propose an optomechanical dissipation engineering scheme by introducing an ancillary mechanical mode with a large decay rate to control the density of states of the optical mode. The effective linewidth of the optical mode can be reduced or broadened, manifesting the dissipation engineering. To demonstrate the ability of our scheme to improve the performance of the optomechanical system, we studied optomechanical entanglement and phonon cooling. It is demonstrated that the optomechanical entanglement overwhelmed by thermal phonon excitations could be restored via dissipation engineering. For the phonon cooling, an order of magnitude improvement could be achieved. Our scheme can be generalized to other systems with multiple bosonic modes, which is experimentally feasible with advances in materials and nanofabrication, including optical Fabry-Perot cavities, superconducting circuits, and nanobeam photonic crystals.

Preprint, 2019, arXiv

Polarization mode hybridization and conversion in phononic wire waveguides

Phononic wire waveguides of subwavelength cross-section support two orthogonal polarization modes: the out-of-plane motion dominated Rayleigh-like and the in-plane motion dominated Love-like modes, analogous to transverse-electric and transverse-magnetic modes in photonic waveguides. Due to the anisotropic elasticity of the substrate material, the polarization states of phonons propagating along certain crystallographic orientations can strongly hybridize. Here we experimentally investigate the orientation-dependent mode hybridization in phononic wire waveguides patterned from GaN-on-sapphire thin films. Such mode hybridization allows efficient actuation of piezoelectrically inactive Love-like modes using common interdigital electrodes designed for Rayleigh-like modes, and further enables on-chip polarization conversion between guided transverse modes. Both are important for on-chip implementation of complex phononic circuits.