Researcher profile

Wancong Zhang

Wancong Zhang contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2021arXiv

MixUp Training Leads to Reduced Overfitting and Improved Calibration for the Transformer Architecture

MixUp is a computer vision data augmentation technique that uses convex interpolations of input data and their labels to enhance model generalization during training. However, the application of MixUp to the natural language understanding (NLU) domain has been limited, due to the difficulty of interpolating text directly in the input space. In this study, we propose MixUp methods at the Input, Manifold, and sentence embedding levels for the transformer architecture, and apply them to finetune the BERT model for a diverse set of NLU tasks. We find that MixUp can improve model performance, as well as reduce test loss and model calibration error by up to 50%.

preprint2020arXiv

A Comprehensive Evaluation of Multi-task Learning and Multi-task Pre-training on EHR Time-series Data

Multi-task learning (MTL) is a machine learning technique aiming to improve model performance by leveraging information across many tasks. It has been used extensively on various data modalities, including electronic health record (EHR) data. However, despite significant use on EHR data, there has been little systematic investigation of the utility of MTL across the diverse set of possible tasks and training schemes of interest in healthcare. In this work, we examine MTL across a battery of tasks on EHR time-series data. We find that while MTL does suffer from common negative transfer, we can realize significant gains via MTL pre-training combined with single-task fine-tuning. We demonstrate that these gains can be achieved in a task-independent manner and offer not only minor improvements under traditional learning, but also notable gains in a few-shot learning context, thereby suggesting this could be a scalable vehicle to offer improved performance in important healthcare contexts.