Source author record

Jihai Zhang

Jihai Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Artificial Intelligence, Machine Learning, cond-mat.mtrl-sci

Catalog footprint

What is connected

3 works

3 topics

4 close collaborators

Preprint, 2026, arXiv

IIB-LPO: Latent Policy Optimization via Iterative Information Bottleneck

Recent advances in Reinforcement Learning with Verifiable Rewards (RLVR) for Large Language Model (LLM) reasoning have been hindered by a persistent challenge: exploration collapse. The semantic homogeneity of random rollouts often traps models in narrow, over-optimized behaviors. While existing methods leverage policy entropy to encourage exploration, they face inherent limitations. Global entropy regularization is susceptible to reward hacking, which can induce meaningless verbosity, whereas local token-selective updates struggle with the strong inductive bias of pre-trained models. To address this, we propose Latent Policy Optimization via Iterative Information Bottleneck (IIB-LPO), a novel approach that shifts exploration from statistical perturbation of token distributions to topological branching of reasoning trajectories. IIB-LPO triggers latent branching at high-entropy states to diversify reasoning paths and employs the Information Bottleneck principle both as a trajectory filter and a self-reward mechanism, ensuring concise and informative exploration. Empirical results across four mathematical reasoning benchmarks demonstrate that IIB-LPO achieves state-of-the-art performance, surpassing prior methods by margins of up to 5.3% in accuracy and 7.4% in diversity metrics.
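The abstract says branching is triggered at high-entropy states. As a minimal sketch of that single idea, assuming a plain Shannon-entropy threshold over next-token distributions, the snippet below flags the decoding steps where a branch could start. The function names, the toy distributions and the threshold value are illustrative assumptions, not the authors' implementation, and the Information Bottleneck filtering and self-reward steps are not modeled.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def branch_points(step_probs, threshold):
    """Indices of decoding steps whose entropy exceeds the threshold.

    The threshold rule is an assumption made for illustration only.
    """
    return [
        i for i, probs in enumerate(step_probs)
        if token_entropy(probs) > threshold
    ]


# Toy next-token distributions for three decoding steps (hypothetical data).
steps = [
    [0.97, 0.01, 0.01, 0.01],  # confident step, low entropy
    [0.25, 0.25, 0.25, 0.25],  # uniform step, entropy = ln(4)
    [0.70, 0.10, 0.10, 0.10],  # moderately uncertain step
]

print(branch_points(steps, threshold=1.0))
```

With this threshold only the uniform step qualifies, since its entropy is ln(4), about 1.386 nats, while the other two steps stay below 1.0.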

Preprint, 2021, arXiv

Synergetic Learning of Heterogeneous Temporal Sequences for Multi-Horizon Probabilistic Forecasting

Time series are ubiquitous across applications such as transportation, finance and healthcare. They are often influenced by external factors, especially in the form of asynchronous events, which makes forecasting difficult. However, existing models are mainly designed for either synchronous time series or asynchronous event sequences, and can hardly provide an integrated way to capture the relation between them. We propose the Variational Synergetic Multi-Horizon Network (VSMHN), a novel deep conditional generative model. To learn complex correlations across heterogeneous sequences, a tailored encoder is devised to combine the advances in deep point process models and variational recurrent neural networks. In addition, an aligned time coding and an auxiliary transition scheme are carefully devised for batched training on unaligned sequences. Our model can be trained effectively using stochastic variational inference and generates probabilistic predictions with Monte Carlo simulation. Furthermore, our model produces accurate, sharp and more realistic probabilistic forecasts. We also show that modeling asynchronous event sequences is crucial for multi-horizon time-series forecasting.
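The abstract mentions probabilistic predictions obtained by Monte Carlo simulation. A minimal sketch of that step follows: draw many future paths from a generative model, then read empirical quantiles at each horizon. The sampler here is a hypothetical Gaussian random walk standing in for a trained model; it is not VSMHN, and all names and parameters are assumptions made for illustration.

```python
import random


def sample_path(rng, horizon, last_value):
    """Hypothetical stand-in for one draw from a generative forecaster.

    Uses a Gaussian random walk; a real model would condition on history.
    """
    path, value = [], last_value
    for _ in range(horizon):
        value += rng.gauss(0.0, 1.0)
        path.append(value)
    return path


def quantile_forecast(paths, q):
    """Empirical q-quantile at each horizon step, by rank in sorted samples."""
    horizon = len(paths[0])
    result = []
    for t in range(horizon):
        column = sorted(path[t] for path in paths)
        index = min(len(column) - 1, int(q * len(column)))
        result.append(column[index])
    return result


rng = random.Random(0)  # fixed seed so the sketch is reproducible
paths = [sample_path(rng, horizon=5, last_value=10.0) for _ in range(1000)]

lower = quantile_forecast(paths, 0.1)
median = quantile_forecast(paths, 0.5)
upper = quantile_forecast(paths, 0.9)
```

The three lists give an 80% prediction band and a median for each of the five horizon steps; more sampled paths give smoother quantile estimates at higher cost.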

Preprint, 2019, arXiv

Single-Layer CrI3 Grown by Molecular Beam Epitaxy

Single- and few-layer chromium triiodide (CrI3), which has been intensively investigated as a promising platform for two-dimensional magnetism, has usually been prepared by mechanical exfoliation. Here, we report on the growth of single-layer CrI3 by molecular beam epitaxy under ultrahigh vacuum. The atomic structures and local density of states have been revealed by scanning tunneling microscopy (STM). Iodine trimers, each of which consists of three I atoms surrounding a three-fold Cr honeycomb center, have been identified as the basic units of the topmost I layer. Different superstructures of single-layer CrI3 with characteristic periodicity around 2-4 nm were obtained on Au(111), but only the pristine structure was observed on graphite. At an elevated temperature (423 K), CrI3 partially decomposed, resulting in the formation of single-layer chromium diiodide. Our bias-dependent STM images suggest that the unoccupied and occupied states are spatially separated, which is consistent with our density functional theory calculations. The effect of charge distribution on the superexchange interaction in single-layer CrI3 is discussed.