Source author record

Yun Hua

Yun Hua appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Topics: math.CA, Artificial Intelligence, Computer Science and Game Theory, Machine Learning, Multiagent Systems

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators

Preprint, 2021, arXiv

Structured Diversification Emergence via Reinforced Organization Control and Hierarchical Consensus Learning

When solving a complex task, humans spontaneously form teams to complete different parts of the task, and cooperation between teammates improves efficiency. In current cooperative multi-agent reinforcement learning (MARL) methods, however, the cooperation team is constructed through either heuristics or end-to-end black-box optimization. To improve the efficiency of cooperation and exploration, we propose Rochico, a structured diversification emergence MARL framework based on reinforced organization control and hierarchical consensus learning. Rochico first learns an adaptive grouping policy through the organization control module, which is built on independent multi-agent reinforcement learning. After team formation, a hierarchical consensus module based on hierarchical intentions with a consensus constraint is introduced. At the same time, using the hierarchical consensus module together with a decision module enhanced by a self-supervised intrinsic reward, Rochico outputs the final diversified multi-agent cooperative policy. The three modules are combined to promote structured diversification emergence. Comparative experiments on four large-scale cooperation tasks show that Rochico is significantly better than current state-of-the-art algorithms in exploration efficiency and cooperation strength.

Preprint, 2014, arXiv

A double inequality for bounding Toader mean by the centroidal mean

In the paper, the authors find the best numbers $α$ and $β$ such that $$ \overline{C}\bigl(αa+(1-α)b,αb+(1-α)a\bigr)<T(a,b) <\overline{C}\bigl(βa+(1-β)b,βb+(1-β)a\bigr) $$ for all $a,b>0$ with $a\ne b$, where $\overline{C}(a,b)=\frac{2(a^2+ab+b^2)}{3(a+b)}$ and $T(a,b)=\frac{2}{π}\int_{0}^{π/2}\sqrt{a^2\cos^2θ+b^2\sin^2θ}\,dθ$ denote respectively the centroidal mean and Toader mean of two positive numbers $a$ and $b$.
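As a rough illustration of the two means named in this abstract, the sketch below evaluates them numerically. It is not taken from the paper: the function names, the use of composite Simpson's rule, and the sample pair are assumptions made here for illustration, and the code does not reproduce the paper's best constants.

```python
import math


def centroidal_mean(a, b):
    """Centroidal mean: 2(a^2 + ab + b^2) / (3(a + b))."""
    return 2.0 * (a * a + a * b + b * b) / (3.0 * (a + b))


def arithmetic_mean(a, b):
    """Arithmetic mean: (a + b) / 2."""
    return (a + b) / 2.0


def toader_mean(a, b, n=1000):
    """Toader mean (2/pi) * integral over [0, pi/2] of
    sqrt(a^2 cos^2(t) + b^2 sin^2(t)) dt, approximated with
    composite Simpson's rule on n subintervals (an assumed choice)."""
    if n % 2:  # Simpson's rule needs an even number of subintervals
        n += 1
    h = (math.pi / 2) / n

    def integrand(t):
        return math.sqrt((a * math.cos(t)) ** 2 + (b * math.sin(t)) ** 2)

    total = integrand(0.0) + integrand(math.pi / 2)
    for i in range(1, n):
        weight = 4 if i % 2 else 2  # odd nodes get 4, even interior nodes get 2
        total += weight * integrand(i * h)
    return (2 / math.pi) * total * h / 3


if __name__ == "__main__":
    a, b = 3.0, 1.0  # hypothetical sample pair
    print("arithmetic:", arithmetic_mean(a, b))
    print("Toader:    ", toader_mean(a, b))
    print("centroidal:", centroidal_mean(a, b))
```

Running the script prints the three values for one sample pair, which gives a quick feel for how close the means are to each other; it is a numerical check only, not evidence for the inequality over all $a,b>0$.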

Preprint, 2013, arXiv

The best bounds for Toader mean in terms of the centroidal and arithmetic means

In the paper, the authors discover the best constants $α_{1}$, $α_{2}$, $β_{1}$, and $β_{2}$ for the double inequalities $$ α_{1}\bar{C}(a,b)+(1-α_{1}) A(a,b)< T(a,b) <β_{1} \bar{C}(a,b)+(1-β_{1})A(a,b) $$ and $$ \frac{α_{2}}{A(a,b)}+\frac{1-α_{2}}{\bar{C}(a,b)}<\frac1{T(a,b)} <\frac{β_{2}}{A(a,b)}+\frac{1-β_{2}}{\bar{C}(a,b)} $$ to be valid for all $a,b>0$ with $a\ne b$, where $$ \bar{C}(a,b)=\frac{2(a^{2}+ab+b^{2})}{3(a+b)},\quad A(a,b)=\frac{a+b}2, $$ and $$ T(a,b)=\frac{2}{π}\int_{0}^{π/2}\sqrt{a^2\cos^2θ+b^2\sin^2θ}\,dθ $$ are respectively the centroidal, arithmetic, and Toader means of two positive numbers $a$ and $b$. As an application of the above inequalities, the authors also find some new bounds for the complete elliptic integral of the second kind.
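The link between the Toader mean and the complete elliptic integral of the second kind can be made explicit with a short standard computation. The notation $E(k)$ and the modulus convention below are assumptions chosen here; the abstract does not say which convention the paper uses.

```latex
% Assume a \ge b > 0 and set k = \sqrt{1 - b^2/a^2}, so that 0 \le k < 1.
\[
  a^2\cos^2\theta + b^2\sin^2\theta
  = a^2\bigl(1 - \sin^2\theta\bigr) + b^2\sin^2\theta
  = a^2\bigl(1 - k^2\sin^2\theta\bigr).
\]
% With E(k) = \int_0^{\pi/2} \sqrt{1 - k^2\sin^2\theta}\, d\theta this gives
\[
  T(a,b) = \frac{2}{\pi}\int_0^{\pi/2} a\sqrt{1 - k^2\sin^2\theta}\, d\theta
         = \frac{2a}{\pi}\, E(k).
\]
```

Under this convention, any two-sided bound on $T(a,b)$ in terms of elementary means turns into a two-sided bound on $E(k)$ after multiplying by $π/(2a)$.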