Researcher profile

Hong Qian

Hong Qian contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
15 works
0 followers
13 topics
4 close collaborators

Published work

15 published items

preprint · 2022 · arXiv

Black-Box Tuning for Language-Model-as-a-Service

Extremely large pre-trained language models (PTMs) such as GPT-3 are usually released as a service. It allows users to design task-specific prompts to query the PTMs through some black-box APIs. In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable. Can we optimize the task prompts by only accessing the model inference APIs? This paper proposes the black-box tuning framework to optimize the continuous prompt prepended to the input text via derivative-free optimization. Instead of optimizing in the original high-dimensional prompt space, which is intractable for traditional derivative-free optimization, we perform optimization in a randomly generated subspace due to the low intrinsic dimensionality of large PTMs. The experimental results show that the black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompt and GPT-3's in-context learning, but also surpasses the gradient-based counterparts, i.e., prompt tuning and full model tuning.
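A minimal sketch of the idea in this abstract, under stated assumptions: a low-dimensional vector `z` is mapped through a fixed random matrix `A` to a full-size prompt, and only loss values returned by a black-box call are used to update `z`. The function `toy_api_loss` is a hypothetical stand-in for an inference API, the dimensions are arbitrary, and the simple (1+1) evolution strategy is an illustrative derivative-free optimizer chosen for brevity; the abstract does not say which optimizer the paper uses.

```python
import random


def make_projection(full_dim, sub_dim, rng):
    """Random Gaussian matrix A of shape (full_dim, sub_dim)."""
    scale = 1.0 / sub_dim ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(sub_dim)]
            for _ in range(full_dim)]


def project(A, z):
    """Map a low-dimensional vector z to a full-size prompt A @ z."""
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]


def toy_api_loss(prompt):
    """Hypothetical stand-in for a black-box inference call returning a loss."""
    return sum((x - 0.5) ** 2 for x in prompt)


def tune(loss_fn, full_dim=200, sub_dim=10, steps=300, sigma=0.3, seed=0):
    """(1+1) evolution strategy over z; uses loss values only, no gradients."""
    rng = random.Random(seed)
    A = make_projection(full_dim, sub_dim, rng)
    z = [0.0] * sub_dim
    best = loss_fn(project(A, z))
    history = [best]
    for _ in range(steps):
        candidate = [zj + sigma * rng.gauss(0.0, 1.0) for zj in z]
        loss = loss_fn(project(A, candidate))
        if loss <= best:  # accept the candidate only if it is no worse
            z, best = candidate, loss
        history.append(best)
    return z, history
```

Calling `tune(toy_api_loss)` searches over 10 numbers instead of 200, which is the point of the random-subspace step: the optimizer never touches the full prompt space directly.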

preprint · 2022 · arXiv

On the Posterior Distribution of a Random Process Conditioned on Empirical Frequencies of a Finite Path: the i.i.d and finite Markov chain case

We obtain the posterior distribution of a random process conditioned on observing the empirical frequencies of a finite sample path. We find that, under a rather broad assumption on the "dependence structure" of the process, e.g., independence or the Markov property, the posterior marginal distribution of the process at a given time index can be identified as a certain empirical distribution computed from the observed empirical frequencies of the sample path. We show that in both cases of discrete-valued i.i.d. sequence and finite Markov chain, a certain "conditional symmetry" given by the observation of the empirical frequencies leads to the desired result on the posterior distribution. Results for both finite-time observations and their asymptotic infinite-time limit are connected via the idea of Gibbs conditioning. Finally, since our results demonstrate a central role of the empirical frequency in understanding the information content of data, we use the Large Deviations Principle (LDP) to construct a general notion of "data-driven entropy", from which one can apply a formalism from the recent study of statistical thermodynamics to data.

preprint · 2022 · arXiv

Unified Policy Optimization for Continuous-action Reinforcement Learning in Non-stationary Tasks and Games

This paper addresses policy learning in non-stationary environments and games with continuous actions. Rather than the classical reward maximization mechanism, inspired by the ideas of follow-the-regularized-leader (FTRL) and mirror descent (MD) update, we propose a no-regret style reinforcement learning algorithm PORL for continuous action tasks. We prove that PORL has a last-iterate convergence guarantee, which is important for adversarial and cooperative games. Empirical studies show that, in stationary environments such as MuJoCo locomotion control tasks, PORL performs as well as, if not better than, the soft actor-critic (SAC) algorithm; in non-stationary environments including dynamical environments, adversarial training, and competitive games, PORL is superior to SAC in both final policy performance and training stability.

preprint · 2021 · arXiv

Derivative-Free Reinforcement Learning: A Review

Reinforcement learning is about learning agent models that make the best sequential decisions in unknown environments. In an unknown environment, the agent needs to explore the environment while exploiting the collected information, which usually forms a sophisticated problem to solve. Derivative-free optimization, meanwhile, is capable of solving sophisticated problems. It commonly uses a sampling-and-updating framework to iteratively improve the solution, where exploration and exploitation also need to be well balanced. Therefore, derivative-free optimization deals with a similar core issue as reinforcement learning, and has been introduced in reinforcement learning approaches, under the names of learning classifier systems and neuroevolution/evolutionary reinforcement learning. Although such methods have been developed for decades, derivative-free reinforcement learning has recently been attracting increasing attention. However, a recent survey on this topic is still lacking. In this article, we summarize methods of derivative-free reinforcement learning to date, and organize the methods in aspects including parameter updating, model selection, exploration, and parallel/distributed methods. Moreover, we discuss some current limitations and possible future directions, hoping that this article could bring more attention to this topic and serve as a catalyst for developing novel and efficient approaches.

preprint · 2020 · arXiv

Asymptotic Behavior of a Sequence of Conditional Probability Distributions and the Canonical Ensemble

The probability distribution of a function of a subsystem conditioned on the value of the function of the whole, in the limit when the ratio of their values goes to zero, has a limit law: It equals the unconditioned marginal probability distribution weighted by an exponential factor whose exponent is uniquely determined by the condition. We apply this theorem to explain the canonical equilibrium ensemble of a system in contact with a heat reservoir. Since the theorem only requires analysis at the level of the function of the subsystem and reservoir, it is applicable even without the knowledge of the composition of the reservoir itself, which extends the applicability of the canonical ensemble. Furthermore, we generalize our theorem to a model with strong interaction that contributes an additional term to the exponent, which is beyond the typical case of approximately additive functions. This result is new in both physics and mathematics, as a theory for the Gibbs conditioning principle for strongly correlated systems. A corollary provides a precise formulation of what a temperature bath is in probabilistic term

preprint · 2020 · arXiv

Representations and Divergences in the Space of Probability Measures and Stochastic Thermodynamics

The Radon-Nikodym (RN) derivative between two measures arises naturally in the affine structure of the space of probability measures with densities. Entropy, free energy, relative entropy, and entropy production as mathematical concepts associated with RN derivatives are introduced. We identify a simple equation that connects two measures with densities as a possible mathematical basis of the entropy balance equation that is central in nonequilibrium thermodynamics. Application of this formalism to Gibbsian canonical distribution yields many results in classical thermomechanics. An affine structure based on the canonical representation and two divergences are introduced in the space of probability measures. It is shown that thermodynamic work, as a conditional expectation, is indicative of the RN derivative between two energy representations being singular. The entropy divergence and the heat divergence yield, respectively, an inequality based on a Massieu-Planck potential and a generalized Carnot inequality.

preprint · 2020 · arXiv

The Statistical Foundation of Entropy in Extended Irreversible Thermodynamics

In the theory of extended irreversible thermodynamics (EIT), the flux-dependent entropy function plays a key role and has a fundamental distinction from the usual flux-independent entropy function adopted by classical irreversible thermodynamics (CIT). However, its existence, as a prerequisite for EIT, and its statistical origin have never been justified. In this work, by studying the macroscopic limit of an $\varepsilon$-dependent Langevin dynamics, which admits a large deviations (LD) principle, we show that the stationary LD rate functions of probability density $p_\varepsilon(x; t)$ and joint probability density $p_\varepsilon(x; \dot{x}; t)$ actually turn out to be the desired flux-independent entropy function in CIT and flux-dependent entropy function in EIT, respectively. The difference of the two entropy functions is determined by the time resolution for Brownian motions times a Lagrangian; the latter arises from the LD Hamilton-Jacobi equation and can be used for constructing conserved Lagrangian/Hamiltonian dynamics.

preprint · 2020 · arXiv

Thermodynamics of Markov Processes with Non-extensive Entropy and Free Energy

Statistical thermodynamics of small systems shows dramatic differences from normal systems. Parallel to the recently presented steady-state thermodynamic formalism for master equation and Fokker-Planck equation, we show that a "thermodynamic" theory can also be developed based on Tsallis' generalized entropy $S^{(q)}=\sum_{i=1}^N(p_i-p_i^q)/[q(q-1)]$ and Shiino's generalized free energy $F^{(q)}=[\sum_{i=1}^Np_i(p_i/\pi_i)^{q-1}-1]/[q(q-1)]$, where $\pi_i$ is the stationary distribution. $dF^{(q)}/dt=-f_d^{(q)}\le 0$ and it is zero iff the system is in its stationary state. $dS^{(q)}/dt-Q_{ex}^{(q)} = f_d^{(q)}$ where $Q_{ex}^{(q)}$ characterizes the heat exchange. For systems approaching equilibrium with detailed balance, $f_d^{(q)}$ is the product of Onsager's thermodynamic flux and force. However, it is discovered that Onsager's force is non-local. This is a consequence of the particular transformation invariance for zero energy of Tsallis' statistics.
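The two formulas in this abstract can be checked numerically. The sketch below evaluates $S^{(q)}$ and $F^{(q)}$ exactly as written above and tracks $F^{(q)}$ along the evolution of a small chain. Two things are assumptions of this illustration and not taken from the abstract: the 3-state transition matrix `EXAMPLE_P` is made up, and a discrete-time Markov chain is used as a simple stand-in for the master equation. The stationary distribution is approximated by repeated iteration.

```python
def tsallis_entropy(p, q):
    """S^(q) = sum_i (p_i - p_i^q) / [q (q - 1)]."""
    return sum(x - x ** q for x in p) / (q * (q - 1))


def shiino_free_energy(p, pi, q):
    """F^(q) = [sum_i p_i (p_i / pi_i)^(q-1) - 1] / [q (q - 1)]."""
    total = sum(x * (x / s) ** (q - 1) for x, s in zip(p, pi))
    return (total - 1.0) / (q * (q - 1))


def step(p, P):
    """One step of the chain: p -> p P (row-vector convention)."""
    n = len(p)
    return [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary(P, iters=2000):
    """Approximate the stationary distribution by power iteration."""
    p = [1.0 / len(P)] * len(P)
    for _ in range(iters):
        p = step(p, P)
    return p


def free_energy_trajectory(p0, P, q, steps=20):
    """Values of F^(q) along the evolution started from p0."""
    pi = stationary(P)
    p, values = list(p0), []
    for _ in range(steps + 1):
        values.append(shiino_free_energy(p, pi, q))
        p = step(p, P)
    return values


# Made-up irreducible, aperiodic transition matrix (rows sum to 1).
EXAMPLE_P = [[0.5, 0.4, 0.1],
             [0.2, 0.5, 0.3],
             [0.3, 0.1, 0.6]]
```

For example, `free_energy_trajectory([0.9, 0.05, 0.05], EXAMPLE_P, q=2.0)` gives a sequence that decreases toward zero, in line with the statement $dF^{(q)}/dt\le 0$.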

preprint · 2020 · arXiv

Unified formalism for entropy productions and fluctuation relations

Stochastic entropy production, which quantifies the difference between the probabilities of trajectories of a stochastic dynamics and its time reversals, has a central role in nonequilibrium thermodynamics. In the theory of probability, the change in the statistical properties of observables can be represented by a change in the probability measure. We consider operators on the space of probability measures that induce changes in the statistical properties of a process, and formulate entropy productions in terms of these change-of-probability-measure (CPM) operators. This mathematical underpinning of the origin of entropy productions allows us to achieve an organization of various forms of fluctuation relations: All entropy productions have a non-negative mean value, admit the integral fluctuation theorem, and satisfy a rather general fluctuation relation. Other results such as the transient fluctuation theorem and detailed fluctuation theorems are then derived from the general fluctuation relation with more constraints on the operator. We use a discrete-time, discrete-state-space Markov process to draw the contradistinction among three reversals of a process: time reversal, protocol reversal and the dual process. The properties of their corresponding CPM operators are examined, and the domains of validity of various fluctuation relations for entropy productions in physics and chemistry are revealed. We also show that our CPM operator formalism can help us rather easily extend other fluctuation relations for excess work and heat, discuss the martingale properties of entropy productions, and derive the stochastic integral formulas for entropy productions in constant-noise diffusion processes using the Girsanov theorem. Our formalism provides a general and concise way to study the properties of entropy-related quantities in stochastic thermodynamics and information theory.

preprint · 2019 · arXiv

Stochastic Dynamics II: Finite Random Dynamical Systems, Linear Representation, and Entropy Production

We study finite state random dynamical systems (RDS) and their induced Markov chains (MC) as stochastic models for complex dynamics. The linear representations of deterministic maps in RDS are matrix-valued random variables whose expectations correspond to the transition matrix of the MC. The instantaneous Gibbs entropy, Shannon-Khinchin entropy per step, and the entropy production rate of the MC are discussed. These three concepts, as key anchor points in stochastic dynamics, characterize, respectively, the uncertainties of the system at time $t$, the randomness generated per step, and the dynamical asymmetry with respect to time reversal. The entropy production rate, expressed in terms of the cycle distributions, also admits an expression in terms of the probability of the deterministic maps with the single attractor in the maximum entropy RDS. For finite RDS with invertible transformations, the non-negative entropy production rate of its MC is bounded above by the Kullback-Leibler divergence of the probability of the deterministic maps with respect to its time-reversal dual probability.
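The statement that the expectation of the matrix-valued random variable gives the Markov chain's transition matrix can be illustrated on a toy system. In the sketch below, each deterministic map on the states {0, 1, 2} is written as a tuple `f` with `f[i]` the image of state `i`; the three maps and their probabilities are made up for illustration and are not taken from the paper.

```python
def map_matrix(f, n):
    """0/1 matrix M of a deterministic map: M[i][j] = 1 iff f sends i to j."""
    return [[1.0 if f[i] == j else 0.0 for j in range(n)] for i in range(n)]


def expected_matrix(maps, probs, n):
    """Probability-weighted average of the map matrices."""
    P = [[0.0] * n for _ in range(n)]
    for f, w in zip(maps, probs):
        M = map_matrix(f, n)
        for i in range(n):
            for j in range(n):
                P[i][j] += w * M[i][j]
    return P


# Hypothetical example: a cyclic permutation and two non-invertible maps.
EXAMPLE_MAPS = [(1, 2, 0), (0, 0, 2), (2, 1, 1)]
EXAMPLE_PROBS = [0.5, 0.3, 0.2]
```

Each row of `expected_matrix(EXAMPLE_MAPS, EXAMPLE_PROBS, 3)` sums to one, because every map sends each state to exactly one image; the first row is approximately `[0.3, 0.5, 0.2]`.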

preprint · 2019 · arXiv

Synchronization in discrete-time, discrete-state Random Dynamical Systems

We characterize the synchronization phenomenon in discrete-time, discrete-state random dynamical systems, with random and probabilistic Boolean networks as particular examples. In terms of multiplicative ergodic properties of the induced linear cocycle, we show that such a random dynamical system with a finite state space synchronizes if and only if the Lyapunov exponent $0$ has simple multiplicity. For the case of countable state space, characterization of synchronization is provided in terms of the spectral subspace corresponding to the Lyapunov exponent $-\infty$. In addition, for both cases of finite and countable state spaces, the mechanism of partial synchronization is described by partitioning the state set into synchronized subsets. Applications to biological networks are also discussed.
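Synchronization here can be read as follows: when the same random sequence of maps is applied to every initial state, all trajectories eventually coincide. The sketch below shows only that notion on a 3-state toy system; it does not compute Lyapunov exponents or test the paper's criterion. Maps are tuples `f` with `f[i]` the image of state `i`, and the example maps are hypothetical.

```python
import random


def apply_map(f, states):
    """Image of a set of states under the deterministic map f."""
    return {f[s] for s in states}


def image_after(maps, sequence, n):
    """States still occupied after applying maps[k] for each k in sequence,
    starting from all n states at once (common noise for every trajectory)."""
    states = set(range(n))
    for k in sequence:
        states = apply_map(maps[k], states)
    return states


def time_to_sync(maps, probs, n, rng, max_steps=1000):
    """First step at which all trajectories coincide, or None if not reached."""
    states = set(range(n))
    for t in range(1, max_steps + 1):
        f = rng.choices(maps, weights=probs, k=1)[0]
        states = apply_map(f, states)
        if len(states) == 1:
            return t
    return None
```

With the maps `[(1, 2, 0), (0, 0, 2)]`, the index sequence `[1, 0, 1]` collapses all three states onto state 0. A system made only of permutations, such as `[(1, 2, 0)]`, never synchronizes, since an invertible map cannot merge two states.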

preprint · 2019 · arXiv

Universal Relation Between Thermodynamic Driving Force and One-Way Fluxes in a Nonequilibrium Chemical Reaction with Complex Mechanism

In nonequilibrium chemical reaction systems, a fundamental relationship between unbalanced kinetic one-way fluxes and thermodynamic chemical driving forces is believed to exist. However, this relation has been rigorously demonstrated only in a few cases in which one-way fluxes are well defined. In terms of its stochastic kinetic representation, we formulate the one-way fluxes for a general chemical reaction far from equilibrium, with arbitrary complex mechanisms, multiple intermediates, and internal kinetic cycles. For each kinetic cycle, the logarithm of the ratio of the steady-state forward and backward one-way fluxes is equal to the free energy difference between the reactants and products along the cycle. This fundamental relation is further established for general chemical reaction networks with multiple input and output complexes. Our result not only provides an equivalent definition of free energy difference in nonequilibrium chemical reaction networks, but also unifies the stochastic and macroscopic nonequilibrium chemical thermodynamics in a very broad sense.

preprint · 2010 · arXiv

An annotated English translation of `Kinetics of stationary reactions' [M. I. Temkin, Dolk. Akad. Nauk SSSR. 152, 156 (1963)]

Temkin's 1963 article on one-way fluxes and flux ratios in steady-state reaction systems bears directly on current research in physical and biological chemistry, such as in the interpretation of metabolic exchange fluxes determined from isotopomer labeling experiments. Yet, originally published in Russian [Dolk. Akad. Nauk SSSR 152, 156-159 (1963)], this article has remained inaccessible to much of the scientific community. Here we provide an English translation of the original article with several additional clarifications and corrections.

preprint · 2009 · arXiv

The Physical Origins of Entropy Production, Free Energy Dissipation and their Mathematical Representations

A complete mathematical theory of nonequilibrium thermodynamics of stochastic systems in terms of master equations is presented. As generalizations of isothermal entropy and free energy, two functions of states play central roles: the Gibbs entropy $S$ and the relative entropy $F$, which are related via the stationary distribution of the stochastic dynamics. $S$ satisfies the fundamental entropy balance equation $dS/dt=e_p-h_d/T$ with entropy production rate $e_p\ge 0$ and heat dissipation rate $h_d$, while $dF/dt=-f_d\le 0$. For closed systems that satisfy detailed balance: $Te_p(t)=f_d(t)$. For open systems one has $Te_p(t)=f_d(t)+Q_{hk}(t)$ where the housekeeping heat $Q_{hk}\ge 0$ was first introduced in the phenomenological nonequilibrium steady state thermodynamics. Entropy production $e_p$ consists of free energy dissipation associated with spontaneous relaxation, $f_d$, and active energy pumping that sustains the open system, $Q_{hk}$. The amount of excess heat involved in the relaxation is $Q_{ex}=h_d-Q_{hk} = f_d-T(dS/dt)$.
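The balance relations in this abstract can be verified numerically on a small master equation. The sketch below makes several assumptions that are not stated in the abstract: units with $T=1$, a made-up 3-state rate matrix `EXAMPLE_Q` with all off-diagonal rates positive and no detailed balance, and the commonly used flux-times-force expressions for $e_p$, $h_d$, $f_d$ and $Q_{hk}$ built from the net flux $p_i q_{ij}-p_j q_{ji}$. The stationary distribution is approximated by iterating the uniformized chain.

```python
import math


def stationary(Q, iters=5000):
    """Approximate pi with pi Q = 0 by iterating pi <- pi (I + Q / lam)."""
    n = len(Q)
    lam = 2.0 * max(-Q[i][i] for i in range(n))  # keeps diagonal entries >= 0.5
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [pi[j] + sum(pi[i] * Q[i][j] for i in range(n)) / lam
              for j in range(n)]
    return pi


def decompose(p, Q):
    """Return e_p, h_d, f_d, Q_hk and dS/dt for distribution p (T = 1)."""
    n = len(p)
    pi = stationary(Q)
    e_p = h_d = f_d = q_hk = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            flux = p[i] * Q[i][j] - p[j] * Q[j][i]  # net flux i -> j
            e_p += 0.5 * flux * math.log(p[i] * Q[i][j] / (p[j] * Q[j][i]))
            h_d += 0.5 * flux * math.log(Q[i][j] / Q[j][i])
            f_d += 0.5 * flux * math.log(p[i] * pi[j] / (p[j] * pi[i]))
            q_hk += 0.5 * flux * math.log(pi[i] * Q[i][j] / (pi[j] * Q[j][i]))
    # dS/dt computed directly from S = -sum_i p_i ln p_i and dp/dt = p Q.
    dp = [sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]
    ds_dt = -sum(dp[i] * math.log(p[i]) for i in range(n))
    return {"e_p": e_p, "h_d": h_d, "f_d": f_d, "Q_hk": q_hk, "dS_dt": ds_dt}


# Made-up rate matrix: off-diagonal entries are rates, rows sum to zero.
EXAMPLE_Q = [[-3.0, 2.0, 1.0],
             [1.0, -4.0, 3.0],
             [2.0, 1.0, -3.0]]
```

For a distribution such as `[0.7, 0.2, 0.1]`, the returned values satisfy `e_p = f_d + Q_hk` and `dS_dt = e_p - h_d` up to rounding, with `Q_hk` strictly positive because the example rates break detailed balance.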

preprint · 2009 · arXiv

Thermodynamic Limit of a Nonequilibrium Steady-State: Maxwell-Type Construction for a Bistable Biochemical System

We show that the thermodynamic limit of a bistable phosphorylation-dephosphorylation cycle has a selection rule for the "more stable" macroscopic steady state. The analysis is akin to the Maxwell construction. Based on the chemical master equation approach, it is shown that, except at a critical point, bistability disappears in the stochastic model when fluctuation is sufficiently low but non-negligible. Onsager's Gaussian fluctuation theory applies to the unique macroscopic steady state. With initial state in the basin of attraction of the "less stable" steady state, the deterministic dynamics obtained by the Law of Mass Action is a metastable phenomenon. Stability and robustness in cell biology are stochastic concepts.