Source author record

Ping Yu

Ping Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, Computation and Language, quant-ph, Artificial Intelligence, Computer Vision, Cryptography and Security, Graphics, physics.optics

Catalog footprint

What is connected

11 works

8 topics

4 close collaborators


preprint, 2023, arXiv

IronForge: An Open, Secure, Fair, Decentralized Federated Learning

Federated learning (FL) provides an effective machine learning (ML) architecture to protect data privacy in a distributed manner. However, the inevitable network asynchrony, the over-dependence on a central coordinator, and the lack of an open and fair incentive mechanism collectively hinder its further development. We propose IronForge, a new generation of FL framework, that features a Directed Acyclic Graph (DAG)-based data structure and eliminates the need for central coordinators to achieve fully decentralized operations. IronForge runs in a public and open network, and launches a fair incentive mechanism by enabling state consistency in the DAG, so that the system fits in networks where training resources are unevenly distributed. In addition, dedicated defense strategies against prevalent FL attacks on incentive fairness and data privacy are presented to ensure the security of IronForge. Experimental results based on a newly developed testbed FLSim highlight the superiority of IronForge to the existing prevalent FL frameworks under various specifications in performance, fairness, and security. To the best of our knowledge, IronForge is the first secure and fully decentralized FL framework that can be applied in open networks with realistic network and training settings.

preprint, 2022, arXiv

Efficient Language Modeling with Sparse all-MLP

All-MLP architectures have attracted increasing interest as an alternative to attention-based models. In NLP, recent work like gMLP shows that all-MLPs can match Transformers in language modeling, but still lag behind in downstream tasks. In this work, we analyze the limitations of MLPs in expressiveness, and propose sparsely activated MLPs with mixture-of-experts (MoEs) in both feature and input (token) dimensions. Such sparse all-MLPs significantly increase model capacity and expressiveness while keeping the compute constant. We address critical challenges in incorporating conditional computation with two routing strategies. The proposed sparse all-MLP improves language modeling perplexity and obtains up to 2× improvement in training efficiency compared to both Transformer-based MoEs (GShard, Switch Transformer, Base Layers and HASH Layers) as well as dense Transformers and all-MLPs. Finally, we evaluate its zero-shot in-context learning performance on six downstream tasks, and find that it surpasses Transformer-based MoEs and dense Transformers.
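To make the idea of sparse activation concrete, the following is a minimal sketch of generic top-1 token routing in a mixture-of-experts layer. It is not the routing used in the paper (the abstract does not specify its two strategies); the names `route_top1`, `sparse_mlp_layer`, `gate_weights`, and `experts` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top1(token, gate_weights):
    """Return (expert index, gate probability) for one token.

    gate_weights holds one weight row per expert (an assumed linear gate).
    """
    logits = [sum(w * x for w, x in zip(row, token)) for row in gate_weights]
    probs = softmax(logits)
    expert = max(range(len(probs)), key=probs.__getitem__)
    return expert, probs[expert]


def sparse_mlp_layer(tokens, gate_weights, experts):
    """Run each token through exactly one expert, scaled by its gate probability."""
    outputs = []
    for token in tokens:
        idx, p = route_top1(token, gate_weights)
        hidden = experts[idx](token)  # only the selected expert is evaluated
        outputs.append([p * h for h in hidden])
    return outputs
```

Because each token activates a single expert, adding experts increases the parameter count while the per-token compute stays roughly constant, which is the trade-off the abstract describes.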

preprint, 2022, arXiv

STT: Soft Template Tuning for Few-Shot Adaptation

Prompt tuning has been an extremely effective tool to adapt a pre-trained model to downstream tasks. However, standard prompt-based methods mainly consider the case where sufficient data are available for the downstream tasks. It is still unclear whether the advantage can be transferred to the few-shot regime, where only limited data are available for each downstream task. Although some works have demonstrated the potential of prompt tuning under the few-shot setting, the mainstream approaches of searching discrete prompts or tuning soft prompts with limited data remain very challenging. Through extensive empirical studies, we find that there is still a gap between prompt tuning and full fine-tuning for few-shot learning. To bridge the gap, we propose a new prompt-tuning framework, called Soft Template Tuning (STT). STT combines manual and auto prompts, and treats downstream classification tasks as a masked language modeling task. Comprehensive evaluation on different settings suggests STT can close the gap between fine-tuning and prompt-based methods without introducing additional parameters. Significantly, it can even outperform the time- and resource-consuming fine-tuning method on sentiment classification tasks.

preprint, 2021, arXiv

Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck

Variational autoencoders (VAEs) are essential tools in end-to-end representation learning. However, a common pitfall of VAEs in sequential text generation is that the model tends to ignore the latent variables when paired with a strong auto-regressive decoder. In this paper, we propose a principled approach to alleviate this issue by applying a discretized bottleneck to enforce an implicit latent feature matching in a more compact latent space. We impose a shared discrete latent space where each input learns to choose a combination of latent atoms as a regularized latent representation. Our model shows a promising capability to model the underlying semantics of discrete sequences and thus provides more interpretable latent structures. Empirically, we demonstrate our model's efficiency and effectiveness on a broad range of tasks, including language modeling, unaligned text style transfer, dialog response generation, and neural machine translation.

preprint, 2021, arXiv

SDA: Improving Text Generation with Self Data Augmentation

Data augmentation has been widely used to improve deep neural networks in many research fields, such as computer vision. However, less work has been done in the context of text, partially due to its discrete nature and the complexity of natural languages. In this paper, we propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation. Unlike most existing sentence-level augmentation strategies, which are only applied to specific models, our method is more general and could be easily adapted to any MLE-based training procedure. In addition, our framework allows task-specific evaluation metrics to be designed to flexibly control the generated sentences, for example, in terms of controlling vocabulary usage and avoiding nontrivial repetitions. Extensive experimental results demonstrate the superiority of our method on two synthetic and several standard real datasets, significantly improving related baselines.

preprint, 2020, arXiv

Feature Quantization Improves GAN Training

The instability in GAN training has been a long-standing problem despite remarkable research efforts. We identify that instability issues stem from difficulties of performing feature matching with mini-batch statistics, due to a fragile balance between the fixed target distribution and the progressively generated distribution. In this work, we propose Feature Quantization (FQ) for the discriminator, to embed both true and fake data samples into a shared discrete space. The quantized values of FQ are constructed as an evolving dictionary, which is consistent with feature statistics of the recent distribution history. Hence, FQ implicitly enables robust feature matching in a compact space. Our method can be easily plugged into existing GAN models, with little computational overhead in training. We apply FQ to 3 representative GAN models on 9 benchmarks: BigGAN for image generation, StyleGAN for face synthesis, and U-GAT-IT for unsupervised image-to-image translation. Extensive experimental results show that the proposed FQ-GAN can improve the FID scores of baseline methods by a large margin on a variety of tasks, achieving new state-of-the-art performance.
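The quantization step can be illustrated with a small sketch: each feature vector is replaced by its nearest dictionary entry, and the dictionary drifts toward recently seen features. The moving-average update and the names `quantize`, `update_dictionary`, and `decay` are assumptions made for illustration; the abstract only states that the dictionary evolves with recent feature statistics.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(feature, dictionary):
    """Return (index, entry) of the dictionary entry nearest to the feature."""
    idx = min(
        range(len(dictionary)),
        key=lambda k: squared_distance(feature, dictionary[k]),
    )
    return idx, list(dictionary[idx])


def update_dictionary(dictionary, features, decay=0.9):
    """Move each used entry toward the mean of the features assigned to it.

    Entries that no feature selects are left unchanged.
    """
    assigned = {}
    for f in features:
        idx, _ = quantize(f, dictionary)
        assigned.setdefault(idx, []).append(f)
    new_dict = [list(entry) for entry in dictionary]
    for idx, group in assigned.items():
        mean = [sum(col) / len(group) for col in zip(*group)]
        new_dict[idx] = [
            decay * e + (1 - decay) * m for e, m in zip(dictionary[idx], mean)
        ]
    return new_dict
```

In this sketch real and generated features would share the same dictionary, so both are mapped into one discrete space before the discriminator compares them.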

preprint, 2020, arXiv

Structure-Aware Human-Action Generation

Generating long-range skeleton-based human actions has been a challenging problem since small deviations in one frame can cause a malformed action sequence. Most existing methods borrow ideas from video generation, naively treating skeleton nodes/joints as pixels of images without considering the rich inter-frame and intra-frame structure information, which can lead to distorted actions. Graph convolutional networks (GCNs) are a promising way to leverage structure information to learn structure representations. However, directly adopting GCNs to tackle such continuous action sequences in both spatial and temporal spaces is challenging, as the action graph could be huge. To overcome this issue, we propose a variant of GCNs that leverages the powerful self-attention mechanism to adaptively sparsify a complete action graph in the temporal space. Our method can dynamically attend to important past frames and construct a sparse graph to apply in the GCN framework, capturing the structure information in action sequences well. Extensive experimental results demonstrate the superiority of our method on two standard human action datasets compared with existing methods.
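One way to picture attention-based temporal sparsification is the sketch below: each frame scores all earlier frames with scaled dot-product attention and keeps edges only to the highest-scoring ones. This is an illustrative assumption, not the paper's architecture; `sparse_temporal_edges` and the `keep` parameter are hypothetical names, and frames are represented as plain feature vectors.

```python
import math


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over a list of keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_temporal_edges(frames, keep=2):
    """Connect each frame t to the `keep` past frames it attends to most.

    Returns directed edges (s, t) with s < t, instead of the complete
    temporal graph that would link every pair of frames.
    """
    edges = []
    for t in range(1, len(frames)):
        weights = attention_weights(frames[t], frames[:t])
        top = sorted(range(t), key=lambda s: weights[s], reverse=True)[:keep]
        edges.extend((s, t) for s in sorted(top))
    return edges
```

With `keep` fixed, the number of temporal edges grows linearly with sequence length rather than quadratically, which is why sparsification makes a graph-convolution step tractable for long sequences.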

preprint, 2014, arXiv

Full control of polarization states and phase distributions of light with dual-metasurfaces

Control of the phase and polarization states of light is an important goal for nearly all optical research. The development of an efficient optical component that allows the simultaneous manipulation of the polarization and phase distribution is needed. Traditional methods require the combination of multiple optical devices, and a single optical device cannot easily realize full control of light. We theoretically predict and experimentally verify that our proposed dual-metasurfaces provide an excellent means to simultaneously manipulate the phase and polarization of transmitted light at the nanoscale. By introducing a phase gradient along the interface, we achieved near-perfect anomalous refraction with controllable polarization in the near-infrared region. On the basis of these properties, we created a dual-metasurface capable of generating a radially polarized beam, demonstrating the power of full control of light. This work opens exciting avenues toward improving the degrees of freedom in the manipulation of light, including the propagation direction and distribution of the polarization and phase, and may profoundly affect a wide range of plasmonic applications.
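For context, anomalous refraction from an interfacial phase gradient is commonly described by the generalized law of refraction. The form below is the standard statement from the metasurface literature, not an equation taken from this abstract; the symbols (refractive indices $n_i$, $n_t$, angles $\theta_i$, $\theta_t$, vacuum wavelength $\lambda_0$, and interfacial phase $\Phi(x)$) are introduced here for illustration.

```latex
n_t \sin\theta_t - n_i \sin\theta_i = \frac{\lambda_0}{2\pi}\,\frac{d\Phi}{dx}
```

When the phase gradient $d\Phi/dx$ is zero, this reduces to the ordinary Snell's law; a nonzero gradient steers the transmitted beam away from the ordinary refraction angle.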

preprint, 2012, arXiv

Entanglement and genuine entanglement of three qubit GHZ diagonal states

We analytically prove the necessary and sufficient criterion for the full separability of three-qubit Greenberger-Horne-Zeilinger (GHZ) diagonal states. The corresponding entanglement is exactly calculable for some GHZ diagonal states and is tractable for the others using the relative entropy of entanglement. We show that the biseparable criterion and the genuine entanglement are determined only by the biggest GHZ diagonal element regardless of all the other smaller diagonal elements. We have completely solved the entanglement problems of three-qubit GHZ diagonal states.

preprint, 2012, arXiv

Genuine Entanglement of Four Qubit Cluster Diagonal States

We reduce the necessary and sufficient biseparable conditions of the four qubit cluster diagonal state to concise forms. Only 4 out of the 15 parameters are proved to be relevant in specifying the genuine entanglement of the state. Using the relative entropy of entanglement as the entanglement measure, we analytically find the genuine entanglement of all the four qubit cluster diagonal states. The formulas of the genuine entanglement are of five kinds, for seven different parameter regions of entanglement.

preprint, 2012, arXiv

Realignment Entanglement Criterion for Continuous Bipartite Symmetric Quantum States

The separability of bipartite non-Gaussian states is studied by applying the realignment criterion with the technique of functional analysis. The realignment criterion is given as a single inequality, in contrast to the infinite number of inequalities based on the moments. We give the necessary and sufficient condition of inseparability for non-Gaussian states prepared by photon subtraction or addition from symmetric Gaussian states. The entanglement criterion for non-Gaussian states evolved in thermal-noise and amplitude-damping environments is also obtained.
