Source author record

Yanbing Zhang

Yanbing Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

physics.optics, Computation and Language, Computer Vision, Artificial Intelligence, Graphics, Information Theory, Machine Learning, math.CO, math.IT, physics.comp-ph, quant-ph

Catalog footprint

What is connected

10 works

11 topics

4 close collaborators

preprint · 2026 · arXiv

Accelerated simulation of multiscale gas-radiation coupling flows via a general synthetic iterative scheme

Gas-radiation coupling critically influences hypersonic reentry flows, where extreme temperatures induce pronounced non-equilibrium gas and radiative heat transport. Accurate and efficient simulation of radiative gas dynamics is therefore indispensable for reliable design of thermal protection systems for atmospheric entry vehicles. In this study, a Boltzmann-type kinetic model for radiative gas flows is solved across a broad spectrum of flow and radiation transport regimes using the general synthetic iterative scheme (GSIS). The approach integrates an unstructured finite-volume discrete velocity method with a set of macroscopic synthetic equations. Within this framework, the kinetic model provides high-order closures for the constitutive relations in the synthetic equations. Simultaneously, the macroscopic synthetic equations drive the evolution of the mesoscopic kinetic system, significantly accelerating steady-state convergence in near-continuum regimes, as substantiated by linear Fourier stability analysis. Crucially, the algorithm is proven to be asymptotic-preserving, correctly recovering the continuum and optically thick limits, represented by the radiative Navier-Stokes-Fourier equations governing distinct translational, rotational, vibrational, and radiative temperatures, on coarse meshes independent of the mean free path. Numerical simulations of challenging benchmarks, including three-dimensional hypersonic flow over an Apollo reentry capsule, demonstrate that GSIS achieves orders-of-magnitude speedup over conventional iterative schemes in multiscale simulations of radiative gas flows while accurately capturing non-equilibrium effects and radiative heat transfer in hypersonic environments.

preprint · 2026 · arXiv

Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation

We present JoyAI-Image, a unified multimodal foundation model for visual understanding, text-to-image generation, and instruction-guided image editing. JoyAI-Image couples a spatially enhanced Multimodal Large Language Model (MLLM) with a Multimodal Diffusion Transformer (MMDiT), allowing perception and generation to interact through a shared multimodal interface. Around this architecture, we build a scalable training recipe that combines unified instruction tuning, long-text rendering supervision, spatially grounded data, and both general and spatial editing signals. This design gives the model broad multimodal capability while strengthening geometry-aware reasoning and controllable visual synthesis. Experiments across understanding, generation, long-text rendering, and editing benchmarks show that JoyAI-Image achieves state-of-the-art or highly competitive performance. More importantly, the bidirectional loop between enhanced understanding, controllable spatial editing, and novel-view-assisted reasoning enables the model to move beyond general visual competence toward stronger spatial intelligence. These results suggest a promising path for unified visual models in downstream applications such as vision-language-action systems and world models.

preprint · 2026 · arXiv

TextLDM: Language Modeling with Continuous Latent Diffusion

Diffusion Transformers (DiT) trained with flow matching in a VAE latent space have unified visual generation across images and videos. A natural next step toward a single architecture for both generation (visual synthesis) and understanding (text generation) is to apply this framework to language modeling. We propose TextLDM, which transfers the visual latent diffusion recipe to text generation with minimal architectural modification. A Transformer-based VAE maps discrete tokens to continuous latents, enhanced by Representation Alignment (REPA) with a frozen pretrained language model to produce representations effective for conditional denoising. A standard DiT then performs flow matching in this latent space, identical in architecture to its visual counterpart. The central challenge we address is obtaining high-quality continuous text representations: we find that reconstruction fidelity alone is insufficient, and that aligning latent features with a pretrained language model via REPA is critical for downstream generation quality. Trained from scratch on OpenWebText2, TextLDM substantially outperforms prior diffusion language models and matches GPT-2 under the same settings. Our results establish that the visual DiT recipe transfers effectively to language, taking a concrete step toward unified diffusion architectures for multimodal generation and understanding.

preprint · 2026 · arXiv

Thinking with Novel Views: A Systematic Analysis of Generative-Augmented Spatial Intelligence

Current Large Multimodal Models (LMMs) struggle with spatial reasoning tasks requiring viewpoint-dependent understanding, largely because they are confined to a single, static observation. We propose Thinking with Novel Views (TwNV), a paradigm that integrates generative novel-view synthesis into the reasoning loop: a Reasoner LMM identifies spatial ambiguity, instructs a Painter to synthesize an alternative viewpoint, and re-examines the scene with the additional evidence. Through systematic experiments we address three research questions. (1) Instruction format: numerical camera-pose specifications yield more reliable view control than free-form language. (2) Generation fidelity: synthesized view quality is tightly coupled with downstream spatial accuracy. (3) Inference-time visual scaling: iterative multi-turn view refinement further improves performance, echoing recent scaling trends in language reasoning. Across four spatial subtask categories and four LMM architectures (both closed- and open-source), TwNV consistently improves accuracy by +1.3 to +3.9 pp, with the largest gains on viewpoint-sensitive subtasks. These results establish novel-view generation as a practical lever for advancing spatial intelligence of LMMs.

preprint · 2020 · arXiv

A note on a conjecture of star chromatic index for outerplanar graphs

A star edge coloring of a graph $G$ is a proper edge coloring of $G$ without bichromatic paths or cycles of length four. The star chromatic index of $G$, $χ_{st}^{'}(G)$, is the minimum number $k$ for which $G$ has a star edge coloring with $k$ colors. L. Bezegová et al. conjectured that $χ_{st}^{'}(G) \leq \lfloor\frac{3Δ}{2}\rfloor+1$ when $G$ is an outerplanar graph with maximum degree $Δ\geq 3$. In this paper we show that $χ_{st}^{'}(G) \leq Δ+6$ when $G$ is a 2-connected outerplanar graph with diameter 2 or 3. If $G$ is a 2-connected outerplanar graph with maximum degree 5, then $χ_{st}^{'}(G) \leq 9$.
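To make the definition in this abstract concrete, the following is a minimal Python sketch of a brute-force checker for star edge colorings. It is an illustration only and not taken from the paper: the function name `is_star_edge_coloring` and the input format (a dict from `frozenset({u, v})` to a color) are assumptions, and "length four" is read as four edges.

```python
def is_star_edge_coloring(coloring):
    """Check whether an edge coloring is a star edge coloring.

    coloring: dict mapping frozenset({u, v}) -> color (hypothetical format).
    Assumes a simple graph (no loops, no parallel edges).
    """
    # Build adjacency sets from the edge list.
    adj = {}
    for edge in coloring:
        u, v = tuple(edge)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def color(u, v):
        return coloring[frozenset((u, v))]

    # Properness: edges sharing an endpoint must get distinct colors.
    for v, nbrs in adj.items():
        colors = [color(v, w) for w in nbrs]
        if len(colors) != len(set(colors)):
            return False

    # No bichromatic path or cycle with four edges.
    # Walk v0-v1-v2-v3-v4 with v0..v3 distinct; v4 == v0 closes a 4-cycle.
    for v0 in adj:
        for v1 in adj[v0]:
            for v2 in adj[v1] - {v0}:
                for v3 in adj[v2] - {v0, v1}:
                    for v4 in adj[v3] - {v1, v2}:
                        alternating = (
                            color(v0, v1) == color(v2, v3)
                            and color(v1, v2) == color(v3, v4)
                        )
                        if alternating:
                            return False
    return True


def path_coloring(colors):
    """Color the path 0-1-2-... with the given sequence of colors."""
    return {frozenset((i, i + 1)): c for i, c in enumerate(colors)}


def cycle4_coloring(colors):
    """Color the 4-cycle 0-1-2-3-0 with the given four colors."""
    edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
    return {frozenset(e): c for e, c in zip(edges, colors)}
```

For instance, `is_star_edge_coloring(cycle4_coloring("abab"))` rejects the 2-coloring of a 4-cycle, while `cycle4_coloring("abac")` passes. The nested loops are exhaustive rather than efficient, which is adequate for small examples.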

preprint · 2016 · arXiv

Correlated photon pair generation in low-loss double-stripe silicon nitride waveguides

We demonstrate correlated photon pair generation via spontaneous four-wave mixing in a low-loss double-stripe silicon nitride waveguide with a coincidence-to-accidental ratio over 10. The coincidence-to-accidental ratio is limited by spontaneous Raman scattering, which can be mitigated by cooling in the future. This demonstration suggests that this waveguide structure is a potential platform to develop integrated quantum photonic chips for quantum information processing.

preprint · 2015 · arXiv

Low-error and broadband microwave frequency measurement in a silicon chip

Instantaneous frequency measurement (IFM) of microwave signals is a fundamental functionality for applications ranging from electronic warfare to biomedical technology. Photonic techniques, and nonlinear optical interactions in particular, have the potential to broaden the frequency measurement range beyond the limits of electronic IFM systems. The key lies in efficiently harnessing optical mixing in an integrated nonlinear platform with low losses. In this work, we exploit the low loss of a 35 cm long, thick silicon waveguide to efficiently harness Kerr nonlinearity and demonstrate the first on-chip four-wave mixing (FWM) based IFM system. We achieve a large 40 GHz measurement bandwidth and record-low measurement error. Finally, we discuss the future prospect of integrating the whole IFM system on a silicon chip to enable the first reconfigurable, broadband IFM receiver with low latency.

preprint · 2015 · arXiv

Ultracompact quantum splitter of degenerate photon pairs

Integrated sources of indistinguishable photons have attracted a lot of attention because of their applications in quantum communication and optical quantum computing. Here, we demonstrate an ultra-compact quantum splitter for degenerate single photons based on a monolithic chip incorporating a Sagnac loop and a micro-ring resonator with a footprint of 0.011 mm², generating and deterministically splitting indistinguishable photon pairs using time-reversed Hong-Ou-Mandel (HOM) interference. The ring resonator provides an enhanced photon generation rate, and the Sagnac loop ensures the photons travel through equal path lengths and interfere with the correct phase to enable the reversed HOM effect to take place. In the experiment, we observed a HOM dip visibility of 94.5 ± 3.3%, indicating the photons generated by the degenerate single photon source are in a suitable state for further integration with other components for quantum applications, such as controlled-NOT gates.

preprint · 2014 · arXiv

Pulse Evolution and Phase Sensitive Amplification in Silicon Waveguides

We provide, for the first time, an analytic solution for pulse propagation and phase sensitive amplification in the regime of high nonlinearity in silicon waveguides, including two-photon absorption (TPA) and free carriers. Our analytic results clearly explain why and how the TPA and free carriers affect the signal gain. These observations are confirmed with numerical modelling and experimental results.

preprint · 2009 · arXiv

Location-Aided Fast Distributed Consensus in Wireless Networks

Existing works on distributed consensus explore linear iterations based on reversible Markov chains, which contribute to the slow convergence of the algorithms. It has been observed that by overcoming the diffusive behavior of reversible chains, certain nonreversible chains lifted from reversible ones mix substantially faster than the original chains. In this paper, we investigate the idea of accelerating distributed consensus via lifting Markov chains, and propose a class of Location-Aided Distributed Averaging (LADA) algorithms for wireless networks, where nodes' coarse location information is used to construct nonreversible chains that facilitate distributed computing and cooperative processing. First, two general pseudo-algorithms are presented to illustrate the notion of distributed averaging through chain-lifting. These pseudo-algorithms are then respectively instantiated through one LADA algorithm on grid networks, and one on general wireless networks. For a $k\times k$ grid network, the proposed LADA algorithm achieves an $ε$-averaging time of $O(k\log(ε^{-1}))$. Based on this algorithm, in a wireless network with transmission range $r$, an $ε$-averaging time of $O(r^{-1}\log(ε^{-1}))$ can be attained through a centralized algorithm. Subsequently, we present a fully-distributed LADA algorithm for wireless networks, which utilizes only the direction information of neighbors to construct nonreversible chains. It is shown that this distributed LADA algorithm achieves the same scaling law in averaging time as the centralized scheme. Finally, we propose a cluster-based LADA (C-LADA) algorithm, which, requiring no central coordination, provides the additional benefit of reduced message complexity compared with the distributed LADA algorithm.
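As a point of reference for this abstract, the sketch below shows the kind of reversible-chain linear iteration that it describes as slow, run on a $k\times k$ grid. It is not the LADA algorithm and not code from the paper. The Metropolis weights, the function names, and the absolute-error stopping test are all assumptions chosen for illustration. The stopping test uses the true average, so it is a diagnostic rather than a distributed stopping rule.

```python
def grid_neighbors(k):
    """4-neighbor adjacency for a k x k grid, keyed by (row, col)."""
    nbrs = {}
    for i in range(k):
        for j in range(k):
            cand = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
            nbrs[(i, j)] = [(a, b) for a, b in cand
                            if 0 <= a < k and 0 <= b < k]
    return nbrs


def averaging_step(x, nbrs):
    """One synchronous step x <- W x with Metropolis weights (assumed choice).

    W is symmetric and doubly stochastic, so the network average is preserved.
    """
    new = {}
    for v, val in x.items():
        acc = val
        for w in nbrs[v]:
            weight = 1.0 / (1 + max(len(nbrs[v]), len(nbrs[w])))
            acc += weight * (x[w] - val)
        new[v] = acc
    return new


def run_averaging(x0, nbrs, eps=1e-3, max_iters=100000):
    """Iterate until every node is within eps of the true average.

    Returns the final values and the number of steps taken.
    """
    target = sum(x0.values()) / len(x0)
    x = dict(x0)
    for t in range(max_iters):
        if max(abs(val - target) for val in x.values()) <= eps:
            return x, t
        x = averaging_step(x, nbrs)
    return x, max_iters


if __name__ == "__main__":
    k = 4
    nbrs = grid_neighbors(k)
    # All mass starts at one corner; the true average is 1.0.
    x0 = {v: 0.0 for v in nbrs}
    x0[(0, 0)] = float(k * k)
    x, steps = run_averaging(x0, nbrs)
    print(f"converged in {steps} steps")
```

Rerunning with larger `k` shows how the step count of this diffusive baseline grows with grid size, which is the behavior the abstract's lifted nonreversible chains are meant to improve on.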

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint