Researcher profile

Xuan Zhu

Xuan Zhu contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
2 topics
4 close collaborators

Published work

3 published items

Preprint, 2020, arXiv

Diversity, Density, and Homogeneity: Quantitative Characteristic Metrics for Text Collections

Summarizing data samples by quantitative measures has a long history, with descriptive statistics being a case in point. However, as natural language processing methods flourish, there are still insufficient characteristic metrics to describe a collection of texts in terms of the words, sentences, or paragraphs they comprise. In this work, we propose metrics of diversity, density, and homogeneity that quantitatively measure the dispersion, sparsity, and uniformity of a text collection. We conduct a series of simulations to verify that each metric holds desired properties and resonates with human intuitions. Experiments on real-world datasets demonstrate that the proposed characteristic metrics are highly correlated with text classification performance of a renowned model, BERT, which could inspire future applications.

Preprint, 2020, arXiv

Personalized Dialogue Generation with Diversified Traits

Endowing a dialogue system with particular personality traits is essential to deliver more human-like conversations. However, due to the challenge of embodying personality via language expression and the lack of large-scale persona-labeled dialogue data, this research problem is still far from well-studied. In this paper, we investigate the problem of incorporating explicit personality traits in dialogue generation to deliver personalized dialogues. First, we construct PersonalDialog, a large-scale multi-turn dialogue dataset containing various traits from a large number of speakers. The dataset consists of 20.83M sessions and 56.25M utterances from 8.47M speakers. Each utterance is associated with a speaker who is marked with traits like Age, Gender, Location, Interest Tags, etc. Several anonymization schemes are designed to protect the privacy of each speaker. This large-scale dataset will facilitate not only the study of personalized dialogue generation, but also other research in sociolinguistics or social science. Second, to study how personality traits can be captured and addressed in dialogue generation, we propose persona-aware dialogue generation models within the sequence-to-sequence learning framework. Explicit personality traits (structured as key-value pairs) are embedded using a trait fusion module. During the decoding process, two techniques, namely persona-aware attention and persona-aware bias, are devised to capture and address trait-related information. Experiments demonstrate that our model is able to address proper traits in different contexts. Case studies also show interesting results for this challenging research problem.

Preprint, 2019, arXiv

Super-Resolved Image Perceptual Quality Improvement via Multi-Feature Discriminators

Generative adversarial networks (GANs) for image super-resolution (SR) have attracted enormous interest in recent years. However, GAN-based SR methods use only an image discriminator to distinguish SR images from high-resolution (HR) images. An image discriminator alone fails to discriminate images accurately because image features cannot be fully expressed. In this paper, we design a new GAN-based SR framework, GAN-IMC, which includes a generator, an image discriminator, a morphological component discriminator, and a color discriminator. The combination of multiple feature discriminators improves the accuracy of image discrimination. Adversarial training between the generator and the multi-feature discriminators forces SR images to converge with HR images in terms of data and feature distributions. Moreover, in some cases, feature enhancement of salient regions is also worth considering. GAN-IMC is further optimized with a weighted content loss (GAN-IMCW), which effectively restores and enhances salient regions in SR images. The effectiveness and robustness of our method are confirmed by extensive experiments on public datasets. Compared with state-of-the-art methods, the proposed method not only achieves competitive Perceptual Index (PI) and Natural Image Quality Evaluator (NIQE) values but also obtains pleasing visual perception in image edges, texture, color, and salient regions.