Source author record

Fei Feng

Fei Feng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning · Artificial Intelligence · eess.SY · math.OC · Systems and Control

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators


Preprint · 2020 · arXiv

AsyncQVI: Asynchronous-Parallel Q-Value Iteration for Discounted Markov Decision Processes with Near-Optimal Sample Complexity

In this paper, we propose AsyncQVI, an asynchronous-parallel Q-value iteration for discounted Markov decision processes whose transition and reward can only be sampled through a generative model. Given such a problem with $|\mathcal{S}|$ states, $|\mathcal{A}|$ actions, and a discount factor $γ\in(0,1)$, AsyncQVI uses memory of size $\mathcal{O}(|\mathcal{S}|)$ and returns an $\varepsilon$-optimal policy with probability at least $1-δ$ using $$\tilde{\mathcal{O}}\big(\frac{|\mathcal{S}||\mathcal{A}|}{(1-γ)^5\varepsilon^2}\log(\frac{1}{δ})\big)$$ samples. AsyncQVI is also the first asynchronous-parallel algorithm for discounted Markov decision processes whose sample complexity nearly matches the theoretical lower bound. The relatively low memory footprint and parallelism make AsyncQVI suitable for large-scale applications. In numerical tests, we compare AsyncQVI with four sample-based value iteration methods. The results show that our algorithm is highly efficient and achieves linear parallel speedup.
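For readers unfamiliar with the setting, the following is a minimal sketch of plain synchronous, sample-based Q-value iteration with a generative model. It is not AsyncQVI: it runs on a single thread and stores the full Q table rather than $\mathcal{O}(|\mathcal{S}|)$ memory. The function names, the toy two-state MDP, and all parameter values are assumptions made for illustration only.

```python
import random


def sampled_q_value_iteration(n_states, n_actions, sample, gamma,
                              n_iters, n_samples, seed=0):
    """Synchronous sample-based Q-value iteration (illustrative, not AsyncQVI).

    `sample(s, a, rng)` plays the role of the generative model and returns
    a pair (next_state, reward) drawn for the state-action pair (s, a).
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_iters):
        # Greedy state values under the current Q estimate.
        v = [max(row) for row in q]
        new_q = [[0.0] * n_actions for _ in range(n_states)]
        for s in range(n_states):
            for a in range(n_actions):
                # Monte Carlo estimate of r(s, a) + gamma * E[v(s')].
                total = 0.0
                for _ in range(n_samples):
                    s_next, reward = sample(s, a, rng)
                    total += reward + gamma * v[s_next]
                new_q[s][a] = total / n_samples
        q = new_q
    # Greedy policy with respect to the final Q estimate.
    policy = [max(range(n_actions), key=lambda a, s=s: q[s][a])
              for s in range(n_states)]
    return q, policy


def toy_sample(s, a, rng):
    """Hypothetical 2-state MDP: being in state 1 pays reward 1.

    Action 1 moves to state 1 with probability 0.9; action 0 always
    moves to state 0.
    """
    if a == 1:
        s_next = 1 if rng.random() < 0.9 else 0
    else:
        s_next = 0
    reward = 1.0 if s == 1 else 0.0
    return s_next, reward


q, policy = sampled_q_value_iteration(
    2, 2, toy_sample, gamma=0.9, n_iters=100, n_samples=200
)
print(policy)
```

In this toy problem the greedy policy should pick action 1 in both states, since that action steers toward the rewarding state.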

Preprint · 2020 · arXiv

Enhanced Microgrid Power Flow Incorporating Hierarchical Control

An enhanced microgrid power flow (EMPF) is devised to incorporate hierarchical control effects. The new contributions are threefold: 1) an advanced-hierarchical-control-based Newton approach is established to accurately assess power sharing and voltage regulation effects; 2) a modified Jacobian matrix is derived to incorporate droop control and various secondary control modes; and 3) the secondary adjustment is calculated on top of the droop-control-based power flow results to ensure a robust Newton solution. Case studies validate that EMPF is efficacious and efficient and can serve as a powerful tool for microgrid operation and monitoring, especially for those highly meshed microgrids in urban areas.

Preprint · 2020 · arXiv

How Does an Approximate Model Help in Reinforcement Learning?

One of the key approaches to save samples in reinforcement learning (RL) is to use knowledge from an approximate model such as a simulator. However, how much does an approximate model help to learn a near-optimal policy of the true unknown model? Despite numerous empirical studies of transfer reinforcement learning, an answer to this question is still elusive. In this paper, we study the sample complexity of RL when an approximate model of the environment is provided. For an unknown Markov decision process (MDP), we show that the approximate model can effectively reduce the complexity by eliminating sub-optimal actions from the policy search space. In particular, we provide an algorithm that uses $\widetilde{O}(N/(1-γ)^3/\varepsilon^2)$ samples in a generative model to learn an $\varepsilon$-optimal policy, where $γ$ is the discount factor and $N$ is the number of near-optimal actions in the approximate model. This can be much smaller than the learning-from-scratch complexity $\widetilde{Θ}(SA/(1-γ)^3/\varepsilon^2)$, where $S$ and $A$ are the sizes of state and action spaces respectively. We also provide a lower bound showing that the above upper bound is nearly tight if the value gap between near-optimal actions and sub-optimal actions in the approximate model is sufficiently large. Our results provide a very precise characterization of how an approximate model helps reinforcement learning when no additional assumption on the model is imposed.
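To give a feel for the two bounds quoted in the abstract, the sketch below compares their leading-order terms while ignoring constants and logarithmic factors. The helper name and the numbers chosen for $S$, $A$, $N$, $γ$ and $\varepsilon$ are hypothetical and do not come from the paper.

```python
def leading_order_samples(count, gamma, eps):
    """Leading-order term count / ((1 - gamma)^3 * eps^2).

    Constants and log factors hidden by the tilde notation are ignored,
    so the result is only meaningful for relative comparisons.
    """
    return count / ((1.0 - gamma) ** 3 * eps ** 2)


# Hypothetical problem sizes, chosen only for illustration.
S, A = 1000, 10      # state and action space sizes
N = 2000             # near-optimal actions kept by the approximate model
gamma, eps = 0.9, 0.1

from_scratch = leading_order_samples(S * A, gamma, eps)
with_model = leading_order_samples(N, gamma, eps)

# The (1 - gamma) and eps terms cancel, leaving N / (S * A).
ratio = with_model / from_scratch
print(f"relative sample cost with the approximate model: {ratio:.2f}")
```

Because both bounds share the same dependence on $γ$ and $\varepsilon$, the saving in this comparison is governed entirely by the ratio $N/(SA)$.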
