Researcher profile

Yijin Zhang

Yijin Zhang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
8 topics
4 close collaborators

Published work

8 published items

preprint, 2026, arXiv

Emergent Cooperative Superstructures via Order-Disorder Kinetics in Molecule-Intercalated NbSe2

The design of quantum states at heterointerfaces has enabled a variety of emergent phenomena. Among them, molecular intercalation superlattices have attracted attention as tunable hybrid materials, formed by inserting organic molecules into van der Waals crystals, where molecular structure and chemistry provide new degrees of freedom. Traditionally, the intercalated molecules have been regarded as inactive spacers, while possible molecular ordering and its impact on the host lattice have remained largely unexplored. Here, we report the discovery of a cooperative superstructure (CSS) phase in molecule-intercalated NbSe2, where ordering of the guest molecules induces a concomitant superstructure in the NbSe2 host lattice, characterized by a moiré structure due to incommensurability between the molecular layer and the inorganic lattice. Synchrotron X-ray diffraction reveals the emergence of the CSS phase, accompanied by crystal symmetry lowering. Complementary resistivity and thermal-quench measurements show that the transition is governed by unusually slow order-disorder kinetics, so that the CSS phase can be selectively accessed under standard laboratory cooling rates. This kinetic behavior arises from slow molecular dynamics coupled to the host lattice, contrasting with fast charge or magnetic ordering in inorganic solids. Our findings establish molecular ordering as a route for engineering heterointerfaces, enabling thermally programmable superstructures.

preprint, 2022, arXiv

Collaborative Intelligent Reflecting Surface Networks with Multi-Agent Reinforcement Learning

The intelligent reflecting surface (IRS) is envisioned to be widely applied in future wireless networks. In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting. Aiming to maximize the long-term average achievable system rate, an optimization problem is formulated by jointly designing the transmit beamforming at the base station (BS) and discrete phase shift beamforming at the IRSs, subject to constraints on transmit power, user data rate requirements and IRS energy buffer size. Considering time-varying channels and stochastic arrivals of energy harvested by the IRSs, we first formulate the problem as a Markov decision process (MDP) and then develop a novel multi-agent Q-mix (MAQ) framework with two layers to decouple the optimization parameters. The higher layer optimizes the phase shift resolutions, and the lower one handles phase shift beamforming and power allocation. Since the phase shift optimization is an integer programming problem with a large-scale action space, we improve MAQ by incorporating the Wolpertinger method, yielding the MAQ-WP algorithm, which achieves a sub-optimal solution with a reduced action space dimension. In addition, as MAQ-WP still requires high complexity to achieve good performance, we propose a policy gradient-based MAQ algorithm, namely MAQ-PG, which maps the discrete phase shift actions into a continuous space at the cost of a slight performance loss. Simulation results demonstrate that the proposed MAQ-WP and MAQ-PG algorithms can converge faster and achieve data rate improvements of 10.7% and 8.8% over the conventional multi-agent DDPG, respectively.

preprint, 2022, arXiv

Reinforcement Learning for Improved Random Access in Delay-Constrained Heterogeneous Wireless Networks

In this paper, we investigate, for the first time, the random access problem for a delay-constrained heterogeneous wireless network. We begin with a simple two-device problem where two devices deliver delay-constrained traffic to an access point (AP) via a common unreliable collision channel. By assuming that one device (called Device 1) adopts ALOHA, we aim to optimize the random access scheme of the other device (called Device 2). The most intriguing part of this problem is that Device 2 does not know the information of Device 1 but needs to maximize the system timely throughput. We first propose a Markov Decision Process (MDP) formulation to derive a model-based upper bound so as to quantify the performance gap of certain random access schemes. We then utilize reinforcement learning (RL) to design an R-learning-based random access scheme, called tiny state-space R-learning random access (TSRA), which is subsequently extended to tackle the general multi-device problem. We carry out extensive simulations to show that the proposed TSRA simultaneously achieves higher timely throughput, lower computational complexity, and lower power consumption than the existing baseline, deep-reinforcement learning multiple access (DLMA). This indicates that our proposed TSRA scheme is a promising means for efficient random access over massive mobile devices with limited computation and battery capabilities.

preprint, 2021, arXiv

Impact of Low-Resolution ADC on DOA Estimation Performance for Massive MIMO Receive Array

In this paper, we present a new scenario of direction of arrival (DOA) estimation using a massive multiple-input multiple-output (MIMO) receive array with low-resolution analog-to-digital converters (ADCs), which can strike a good balance between performance and circuit cost. Based on the linear additive quantization noise model (AQNM), the effect of low-resolution ADCs on DOA estimation methods such as Root-MUSIC is analyzed. Also, the closed-form expression of the Cramér-Rao lower bound (CRLB) is derived to evaluate the performance loss caused by the low-resolution ADCs. The simulation results show that the Root-MUSIC method can achieve the corresponding CRLB. Furthermore, 2-3 bits are acceptable for most applications if a 1 dB performance loss is tolerable.

preprint, 2020, arXiv

Age-of-Information-based Scheduling in Multiuser Uplinks with Stochastic Arrivals: A POMDP Approach

In this paper, we consider a multiuser uplink status update system, where a monitor aims to timely collect randomly generated status updates from multiple end nodes through a shared wireless channel. We adopt the recently proposed metric, termed age of information (AoI), to quantify the information timeliness and freshness. Due to the random generation of the status updates at the end nodes, the monitor has only partial knowledge of the status update arrivals. Under such a practical scenario, we aim to address a fundamental multiuser scheduling problem: how to schedule the end nodes to minimize the network-wide AoI? To solve this problem, we formulate it as a partially observable Markov decision process (POMDP), and develop a dynamic programming (DP) algorithm to obtain the optimal scheduling policy. Noting that the optimal policy is computationally prohibitive, we further design a low-complexity myopic policy that only minimizes the one-step expected reward. Simulation results show that the performance of the myopic policy can approach that of the optimal policy, and is better than that of the baseline policy.

preprint, 2020, arXiv

Dynamic Virtual Resource Allocation for 5G and Beyond Network Slicing

The fifth generation and beyond wireless communication will support vastly heterogeneous services and use demands such as massive connectivity, low latency and high transmission rate. Network slicing has been envisaged as an efficient technology to meet these diverse demands. In this paper, we propose a dynamic virtual resource allocation scheme based on radio access network (RAN) slicing for uplink communications to ensure the quality-of-service (QoS). To maximize the weighted-sum transmission rate under a delay constraint, we formulate a joint optimization problem of subchannel allocation and power control as an infinite-horizon average-reward constrained Markov decision process (CMDP) problem. Based on the equivalent Bellman equation, the optimal control policy is first derived by the value iteration algorithm. However, the optimal policy suffers from the widely known curse-of-dimensionality problem. To address this problem, the linear value function approximation (approximate dynamic programming) is adopted. Then, the subchannel allocation Q-factor is decomposed into per-slice Q-factors. Furthermore, the Q-factor and Lagrangian multipliers are updated using an online stochastic learning algorithm. Finally, simulation results reveal that the proposed algorithm can meet the delay requirements and improve the user transmission rate compared with baseline schemes.

preprint, 2020, arXiv

Schedule Sequence Design for Broadcast in Multi-channel Ad Hoc Networks

We consider a single-hop ad hoc network in which each node aims to broadcast packets to its neighboring nodes by using multiple slotted, TDD collision channels. There is no cooperation among the nodes. To ensure successful broadcast, we propose to pre-assign each node a periodic sequence to schedule transmissions and receptions at each time slot. These sequences are referred to as schedule sequences. Since each node starts its transmission schedule independently, there exist relative time offsets among the schedule sequences they use. Our objective is to design schedule sequences such that each node can transmit at least one packet to each of its neighbors successfully within a common period, no matter what the time offsets are. The sequence period should be as short as possible. In this paper, we analyze the lower bound on the sequence period, and propose a sequence construction method by which the period can achieve the same order as the lower bound. We also consider the random scheme in which each node transmits or receives on a channel at each time slot with a pre-determined probability. The frame length and broadcast completion time under different schemes are compared by numerical studies.