Researcher profile

Tao Ouyang

Tao Ouyang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
6 works
0 followers
8 topics
4 close collaborators

Published work

6 published items

preprint, 2023, arXiv

HiFlash: Communication-Efficient Hierarchical Federated Learning with Adaptive Staleness Control and Heterogeneity-aware Client-Edge Association

Federated learning (FL) is a promising paradigm that enables collaboratively learning a shared model across massive numbers of clients while keeping the training data local. However, in many existing FL systems, clients need to frequently exchange large model parameters directly with the remote cloud server via wide-area networks (WAN), leading to significant communication overhead and long transmission time. To mitigate the communication bottleneck, we resort to the hierarchical federated learning paradigm of HiFL, which reaps the benefits of mobile edge computing and combines synchronous client-edge model aggregation with asynchronous edge-cloud model aggregation to greatly reduce the traffic volume of WAN transmissions. Specifically, we first analyze the convergence bound of HiFL theoretically and identify the key controllable factors for model performance improvement. We then advocate an enhanced design, HiFlash, which integrates deep reinforcement learning based adaptive staleness control and a heterogeneity-aware client-edge association strategy to boost system efficiency and mitigate the staleness effect without compromising model accuracy. Extensive experiments corroborate the superior performance of HiFlash in model accuracy, communication reduction, and system efficiency.
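
The two-level aggregation described in this abstract can be illustrated with a minimal sketch. This is not the HiFlash algorithm: it assumes models are flat lists of floats, uses sample-count-weighted averaging for the synchronous client-edge step, and applies a hypothetical staleness discount `alpha / (1 + staleness)` for the asynchronous edge-cloud merge. The function names, the `alpha` parameter, and the discount rule are assumptions made for illustration only.

```python
from typing import List, Sequence


def edge_aggregate(
    client_models: Sequence[Sequence[float]],
    sample_counts: Sequence[int],
) -> List[float]:
    """Synchronous client-edge step: sample-weighted average of client models."""
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("sample_counts must sum to a positive number")
    dim = len(client_models[0])
    return [
        sum(model[i] * n for model, n in zip(client_models, sample_counts)) / total
        for i in range(dim)
    ]


def cloud_merge(
    cloud_model: Sequence[float],
    edge_model: Sequence[float],
    staleness: int,
    alpha: float = 0.5,
) -> List[float]:
    """Asynchronous edge-cloud step (hypothetical rule, not from the paper):
    the mixing weight shrinks as the edge update becomes more stale."""
    weight = alpha / (1 + staleness)
    return [(1 - weight) * c + weight * e for c, e in zip(cloud_model, edge_model)]


# Two clients under one edge node, holding 1 and 3 samples respectively.
edge_model = edge_aggregate([[1.0, 2.0], [3.0, 4.0]], [1, 3])
# A fresh update (staleness 0) moves the cloud model further than a stale one.
fresh = cloud_merge([0.0, 0.0], edge_model, staleness=0)
stale = cloud_merge([0.0, 0.0], edge_model, staleness=1)
```

In this sketch only the edge-cloud merge crosses the WAN, which is the traffic the hierarchical design aims to reduce; how staleness is actually controlled in HiFlash is decided by the reinforcement learning component described in the paper, not by a fixed formula like the one assumed here.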

preprint, 2022, arXiv

Collaboration in Participant-Centric Federated Learning: A Game-Theoretical Perspective

Federated learning (FL) is a promising distributed framework for collaborative artificial intelligence model training while protecting user privacy. A bootstrapping component that has attracted significant research attention is the design of incentive mechanisms to stimulate user collaboration in FL. The majority of works adopt a broker-centric approach to help the central operator attract participants and obtain a well-trained model. Few works consider forging participant-centric collaboration among participants to pursue an FL model for their common interests, which induces dramatic differences in incentive mechanism design from broker-centric FL. To coordinate the selfish and heterogeneous participants, we propose a novel analytic framework for incentivizing effective and efficient collaborations for participant-centric FL. Specifically, we propose two novel game models, one for contribution-oblivious FL (COFL) and one for contribution-aware FL (CAFL), where the latter implements a minimum contribution threshold mechanism. We further analyze the existence and uniqueness of the Nash equilibrium of both the COFL and CAFL games and design efficient algorithms to achieve equilibrium solutions. Extensive performance evaluations show that a free-riding phenomenon exists in COFL, which can be greatly alleviated by adopting the CAFL model with an optimized minimum threshold.

preprint, 2022, arXiv

Flat-band based ferromagnetic semiconducting state in the graphitic C$_4$N$_3$ monolayer

A new set of lattice models based on the hexagonal $\sqrt{N}\times\sqrt{N}$ super-cells of the well-known honeycomb lattice with a single-hole defect (HL-D-1/2N) is proposed to realize nontrivial isolated flat bands. By performing both tight-binding and density functional theory calculations, we demonstrate that the experimentally realized graphitic carbon nitride (Adv. Mater., 22, 1004, 2010; Nat. Commun., 9, 3366, 2018), the HL-D-1/8 based C$_4$N$_3$, is a perfect system to host such flat bands. The flat high-energy P-6m2 C$_4$N$_3$ structure displays ferromagnetic half-metallicity that is not related to the isolated flat bands. However, the P-6m2 C$_4$N$_3$ structure is dynamically unstable. Using a structure searching method based on group and graph theory, we find that a new corrugated Pca21 C$_4$N$_3$ structure has the lowest energy among all known C$_4$N$_3$ structures. This Pca21 C$_4$N$_3$ structure is an intrinsic ferromagnetic half-semiconductor (Tc$\approx$241 K) with one semiconducting spin-channel (1.75 eV) and one insulating spin-channel (3.64 eV), which is quite rare in two-dimensional (2D) systems. Its ferromagnetic semiconducting property originates from the isolated p$_z$-state flat band, as the corrugation shifts the flat band upward to the Fermi level. Interestingly, this Pca21 C$_4$N$_3$ structure is found to be piezoelectric and ferroelectric, which makes C$_4$N$_3$ an unusual transition-metal-free 2D multiferroic.

preprint, 2022, arXiv

Separating Data via Block Invalidation Time Inference for Write Amplification Reduction in Log-Structured Storage

Log-structured storage has been widely deployed in various domains of storage systems, yet its garbage collection incurs write amplification (WA) due to the rewrites of live data. We show that there exists an optimal data placement scheme that minimizes WA using the future knowledge of block invalidation time (BIT) of each written block, yet it is infeasible to realize in practice. We propose a novel data placement algorithm for reducing WA, SepBIT, that aims to infer the BITs of written blocks from storage workloads and separately place the blocks into groups with similar estimated BITs. We show via both mathematical and production trace analyses that SepBIT effectively infers the BITs by leveraging the write skewness property in practical storage workloads. Trace analysis and prototype experiments show that SepBIT reduces WA and improves I/O throughput, respectively, compared with state-of-the-art data placement schemes. SepBIT is currently deployed to support the log-structured block storage management at Alibaba Cloud.
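
For readers unfamiliar with the metric, write amplification is commonly defined as the total number of blocks written (user writes plus garbage-collection rewrites) divided by the number of user-written blocks. The sketch below computes that ratio and groups blocks by an estimated invalidation time. It is an illustration only, not SepBIT's inference procedure: the function names, the class boundaries, and the per-block estimates are hypothetical inputs.

```python
from bisect import bisect_right
from typing import Dict, Hashable, List, Sequence


def write_amplification(user_writes: int, gc_rewrites: int) -> float:
    """WA = (user-written blocks + GC-rewritten blocks) / user-written blocks."""
    if user_writes <= 0:
        raise ValueError("user_writes must be positive")
    return (user_writes + gc_rewrites) / user_writes


def group_by_estimated_bit(
    estimated_bits: Dict[Hashable, float],
    boundaries: Sequence[float],
) -> Dict[int, List[Hashable]]:
    """Place blocks with similar estimated invalidation times in the same group.

    `boundaries` must be sorted; a block with estimate t goes to the group whose
    index is the number of boundaries less than or equal to t (assumed scheme).
    """
    groups: Dict[int, List[Hashable]] = {}
    for block, bit in estimated_bits.items():
        groups.setdefault(bisect_right(boundaries, bit), []).append(block)
    return groups


# 100 user-written blocks plus 50 blocks rewritten by garbage collection.
wa = write_amplification(user_writes=100, gc_rewrites=50)
# Blocks expected to be invalidated soon ("a", "c") are separated from "b".
groups = group_by_estimated_bit({"a": 5, "b": 50, "c": 7}, boundaries=[10, 100])
```

Grouping blocks that are expected to die together means a garbage-collected segment tends to contain few live blocks, so fewer rewrites are needed; the quality of the estimates is what the paper's inference from write skewness addresses.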

preprint, 2020, arXiv

New structure candidates for the experimentally synthesized heptazine-based and triazine-based two dimensional graphitic carbon nitride

The widely used crystal structures for both heptazine-based and triazine-based two-dimensional (2D) graphitic carbon nitride (g-C$_3$N$_4$) are the flat P-6m2 configurations. However, experimentally synthesized 2D g-C$_3$N$_4$ has a thickness in the range 0.2-0.5 nm, indicating that the flat P-6m2 configurations used in theory are not the correct ground states. In this work, we propose three new corrugated structures, P321, P3m1 and Pca21, with energies 66 (86), 77 (87) and 78 (89) meV/atom lower than that of the corresponding heptazine-based (triazine-based) g-C$_3$N$_4$ in the flat P-6m2 configuration, respectively. These corrugated structures have periodic patterns very similar to the flat P-6m2 ones, and they are difficult to distinguish from each other in top view. The optimized thicknesses of the three corrugated structures, ranging from 1.347 to 3.142 Å, are in good agreement with the experimental results. The first-principles results show that these corrugated structural candidates are also semiconductors with band gaps slightly larger than those of the corresponding flat P-6m2 ones. Furthermore, they also possess suitable band edge positions for sunlight-driven water splitting in both $pH=0$ and $pH=7$ environments. Our results show that these three new structures are more promising candidates for the experimentally synthesized g-C$_3$N$_4$.

preprint, 2020, arXiv

Theoretical prediction of a low-energy Stone-Wales graphene with intrinsic type-III Dirac-cone

Based on first-principles methods, we predict a new low-energy Stone-Wales graphene, SW40, which has an orthorhombic lattice with Pbam symmetry and 40 carbon atoms in its crystalline cell forming well-arranged Stone-Wales patterns. The calculated total energy of SW40 is just about 133 meV higher than that of graphene, indicating that its stability exceeds that of all previously proposed graphene allotropes. We find that SW40 possesses an intrinsic type-III Dirac-cone (Phys. Rev. Lett., 120, 237403, 2018) formed by the band crossing of a local linear band and a local flat band, which can result in highly anisotropic fermions in the system. Interestingly, this intrinsic type-III Dirac-cone can be effectively tuned by inner-layer strains, and it transforms into type-II and type-I Dirac-cones under tensile and compressive strains, respectively. Finally, a general tight-binding model is constructed to understand the electronic properties near the Fermi level in SW40. The results show that the type-III Dirac-cone feature can be well understood in terms of the $\pi$-electron interactions between adjacent Stone-Wales defects.