Researcher profile

Dong Zheng

Dong Zheng contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified | Verification L1 | Unclaimed author
5 works
0 followers
7 topics
4 close collaborators

Published work

5 published items

preprint | 2024 | arXiv

Two-Stage Constrained Actor-Critic for Short Video Recommendation

The wide popularity of short videos on social media poses new opportunities and challenges for optimizing recommender systems on video-sharing platforms. Users sequentially interact with the system and provide complex and multi-faceted responses, including watch time and various types of interactions with multiple videos. On the one hand, the platform aims to optimize users' cumulative watch time (the main goal) in the long term, which can be effectively optimized by reinforcement learning. On the other hand, the platform also needs to satisfy the constraint of accommodating the responses of multiple user interactions (auxiliary goals) such as like, follow, and share. In this paper, we formulate the problem of short video recommendation as a Constrained Markov Decision Process (CMDP). We find that traditional constrained reinforcement learning algorithms cannot work well in this setting. We propose a novel two-stage constrained actor-critic method: At stage one, we learn individual policies to optimize each auxiliary signal. At stage two, we learn a policy to (i) optimize the main signal and (ii) stay close to the policies learned at the first stage, which effectively guarantees the performance of this main policy on the auxiliaries. Through extensive offline evaluations, we demonstrate the effectiveness of our method over alternatives in both optimizing the main goal and balancing the others. We further show the advantage of our method in live experiments of short video recommendations, where it significantly outperforms other baselines in terms of both watch time and interactions. Our approach has been fully launched in the production system to optimize user experiences on the platform.
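The two-stage idea in this abstract can be illustrated with a deliberately small sketch. It is not the authors' actor-critic method: it assumes a single-step (bandit) setting with three candidate videos, softmax policies at stage one, and a KL penalty with weight `lam` as the "stay close" term at stage two. All function names, rewards, and parameters below are hypothetical. Under these assumptions, the stage-two objective E_pi[r] - lam * sum_i KL(pi || pi_i) has a closed-form maximizer, which the code computes.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def stage_one(aux_rewards):
    """Stage one (toy version): one softmax policy per auxiliary signal."""
    return [softmax(r) for r in aux_rewards]

def stage_two(main_reward, aux_policies, lam):
    """Stage two (toy version): maximize E_pi[r] - lam * sum_i KL(pi || pi_i).

    For a single-step problem the maximizer is
    pi(a) proportional to exp(r(a) / (lam * m)) * prod_i pi_i(a)^(1/m),
    where m is the number of auxiliary policies.
    """
    m = len(aux_policies)
    logits = []
    for a, r in enumerate(main_reward):
        mean_log = sum(math.log(p[a]) for p in aux_policies) / m
        logits.append(r / (lam * m) + mean_log)
    return softmax(logits)

def objective(pi, main_reward, aux_policies, lam):
    """Expected main reward minus the KL penalty toward each auxiliary policy."""
    expected = sum(p * r for p, r in zip(pi, main_reward))
    kl = sum(sum(p * math.log(p / q) for p, q in zip(pi, aux))
             for aux in aux_policies)
    return expected - lam * kl

# Hypothetical per-video rewards (not taken from the paper).
main_reward = [1.0, 0.2, 0.1]          # watch time
aux_rewards = [[0.0, 1.0, 0.0],        # share
               [0.0, 0.8, 0.6]]        # follow
aux_policies = stage_one(aux_rewards)
policy = stage_two(main_reward, aux_policies, lam=0.5)
```

A small `lam` lets the main reward dominate, while a large `lam` pulls the policy toward the normalized geometric mean of the auxiliary policies.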

preprint | 2022 | arXiv

Constrained Reinforcement Learning for Short Video Recommendation

The wide popularity of short videos on social media poses new opportunities and challenges for optimizing recommender systems on video-sharing platforms. Users provide complex and multi-faceted responses to recommendations, including watch time and various types of interactions with videos. As a result, established recommendation algorithms that address a single objective are not adequate to meet this new demand of optimizing comprehensive user experiences. In this paper, we formulate the problem of short video recommendation as a constrained Markov Decision Process (MDP), where platforms want to optimize the main goal of user watch time in the long term, with the constraint of accommodating the auxiliary responses of user interactions such as sharing or downloading videos. To solve the constrained MDP, we propose a two-stage reinforcement learning approach based on the actor-critic framework. At stage one, we learn individual policies to optimize each auxiliary response. At stage two, we learn a policy to (i) optimize the main response and (ii) stay close to the policies learned at the first stage, which effectively guarantees the performance of this main policy on the auxiliaries. Through extensive simulations, we demonstrate the effectiveness of our approach over alternatives in both optimizing the main goal and balancing the others. We further show the advantage of our approach in live experiments of short video recommendations, where it significantly outperforms other baselines in terms of watch time and interactions from video views. Our approach has been fully launched in the production system to optimize user experiences on the platform.
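For orientation, a constrained MDP of the kind this abstract describes is commonly written in the following generic form. The symbols are assumptions for illustration and are not taken from the paper: $r^{(0)}$ is the main (watch-time) reward, $r^{(i)}$ for $i = 1, \dots, m$ are the auxiliary rewards, $C_i$ are constraint thresholds, and $\gamma$ is a discount factor.

```latex
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} \, r^{(0)}_{t} \right]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} \, r^{(i)}_{t} \right] \ge C_{i},
\qquad i = 1, \dots, m .
```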

preprint | 2021 | arXiv

X-ray and Radio Studies of the candidate Millisecond Pulsar Binary 4FGL J0935.3+0901

4FGL J0935.3+0901, a $γ$-ray source recently identified as a candidate redback-type millisecond pulsar (MSP) binary, shows an interesting feature: double-peaked emission lines in its optical spectrum. This feature would further suggest that the source is a transitional MSP system in the sub-luminous disk state. We have observed the source with XMM-Newton and the Five-hundred-meter Aperture Spherical radio Telescope (FAST) at X-ray and radio frequencies, respectively, for further studies. From the X-ray observation, a bimodal count-rate distribution, which is a distinctive feature of transitional MSP systems, is not detected, while the properties of the X-ray variability and power-law spectrum are determined for the source. These results help establish that the source is consistent with being a redback in the radio pulsar state. However, no radio pulsation signals are found in the FAST observation, resulting in an upper limit on the flux density of $\sim 4\,μ$Jy. Implications of these results are discussed.

preprint | 2013 | arXiv

Pomeranchuk cooling of the SU($2N$) ultra-cold fermions in optical lattices

We investigate the thermodynamic properties of a half-filled SU(2N) Fermi-Hubbard model on the two-dimensional square lattice using determinantal quantum Monte Carlo simulations, which are free of the fermion "sign problem". The large number of hyperfine-spin components enhances spin fluctuations, which facilitates Pomeranchuk cooling to temperatures comparable to the superexchange energy scale in the case of SU(6). Various quantities, including entropy, charge fluctuations, and spin correlations, have been calculated.

preprint | 2011 | arXiv

Continuous quantum phase transition between two topologically distinct valence bond solid states associated with the same spin value

We propose a simple one-dimensional spin-2 Hamiltonian that exhibits two topologically distinct valence bond solid states in different exactly solvable limits. We then construct the phase diagram and study the quantum phase transition between these states using the infinite time-evolving block decimation algorithm. From the scaling relation between the entanglement entropy and the correlation length, we find that the central charge of the underlying critical conformal field theory is $c=2$.
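The abstract does not spell out the scaling relation it uses. A standard form for an infinite chain cut into two halves is S = (c/6) ln(ξ) + const, and the sketch below assumes that form. It fits c by least squares from pairs of correlation length and entanglement entropy. The function name and the data points are hypothetical: the points are synthetic, generated from the assumed relation with c = 2, and are not results from the paper.

```python
import math

def fit_central_charge(xi, entropy):
    """Least-squares fit of S = (c / 6) * ln(xi) + const.

    Returns (c, const). Assumes the half-infinite-chain scaling form;
    other geometries use a different prefactor.
    """
    x = [math.log(v) for v in xi]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(entropy) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, entropy))
    slope = sxy / sxx
    return 6.0 * slope, mean_y - slope * mean_x

# Synthetic illustration only: entropies generated from the assumed relation.
xi_values = [4.0, 8.0, 16.0, 32.0]
s_values = [(2.0 / 6.0) * math.log(v) + 0.7 for v in xi_values]
c_est, offset = fit_central_charge(xi_values, s_values)
```

With real simulation data, each pair would typically come from a run at a different bond dimension, and the fit quality would indicate how well the assumed scaling form holds.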