Source author record

Xiaokang Liu

Xiaokang Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Biological Physics, Computation and Language, Machine Learning, Methodology, quant-ph, Applications, Biomolecules, cond-mat.soft, gr-qc, hep-th, Information Theory, math.IT, physics.chem-ph, physics.class-ph

Catalog footprint

What is connected

10 works

14 topics

4 close collaborators

Preprint · 2022 · arXiv

Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk

Language is the principal tool for human communication, in which humor is one of the most attractive parts. Producing natural language like humans using computers, a.k.a. Natural Language Generation (NLG), has been widely used for dialogue systems, chatbots, and machine translation, as well as computer-aided creation, e.g., idea generation and scriptwriting. However, the humor aspect of natural language is relatively under-investigated, especially in the age of pre-trained language models (PLMs). In this work, we aim to preliminarily test whether NLG can generate humor as humans do. We build a new dataset consisting of numerous digitized Chinese Comical Crosstalk scripts (called C$^3$ for short), drawn from a popular Chinese performing art called 'Xiangsheng' that has existed since the 1800s. (For the convenience of non-Chinese speakers, we use 'crosstalk' for 'Xiangsheng' in this paper.) We benchmark various generation approaches, including training-from-scratch Seq2seq, fine-tuned middle-scale PLMs, and large-scale PLMs (with and without fine-tuning). Moreover, we also conduct a human assessment, showing that 1) large-scale pretraining largely improves crosstalk generation quality; and 2) even the scripts generated by the best PLM are far from what we expect, with only 65% of the quality of human-created crosstalk. We conclude that humor generation could be largely improved using large-scale PLMs, but it is still in its infancy. The data and benchmarking code are publicly available at https://github.com/anonNo2/crosstalk-generation.

Preprint · 2020 · arXiv

Empirical Evaluation of Multi-task Learning in Deep Neural Networks for Natural Language Processing

Multi-Task Learning (MTL) aims at boosting the overall performance of each individual task by leveraging useful information contained in multiple related tasks. It has shown great success in natural language processing (NLP). Currently, a number of MTL architectures and learning mechanisms have been proposed for various NLP tasks. However, there has been no systematic, in-depth exploration and comparison of the different MTL architectures and learning mechanisms behind this strong performance. In this paper, we conduct a thorough examination of typical MTL methods on a broad range of representative NLP tasks. Our primary goal is to understand the merits and demerits of existing MTL methods in NLP tasks, and thereby to devise new hybrid architectures intended to combine their strengths.

Preprint · 2020 · arXiv

Multivariate Functional Regression via Nested Reduced-Rank Regularization

We propose a nested reduced-rank regression (NRRR) approach for fitting regression models with multivariate functional responses and predictors, to achieve tailored dimension reduction and facilitate interpretation and visualization of the resulting functional model. Our approach is based on a two-level low-rank structure imposed on the functional regression surfaces. A global low-rank structure identifies a small set of latent principal functional responses and predictors that drive the underlying regression association. A local low-rank structure then controls the complexity and smoothness of the association between the principal functional responses and predictors. Through a basis expansion approach, the functional problem boils down to an integrated matrix approximation task, where the blocks or submatrices of an integrated low-rank matrix share some common row space and/or column space. An iterative algorithm with a convergence guarantee is developed. We establish the consistency of NRRR and also show through non-asymptotic analysis that it can achieve at least a comparable error rate to that of reduced-rank regression. Simulation studies demonstrate the effectiveness of NRRR. We apply NRRR to an electricity demand problem, relating the trajectories of daily electricity consumption to those of daily temperatures.

Preprint · 2020 · arXiv

Multivariate Log-Contrast Regression with Sub-Compositional Predictors: Testing the Association Between Preterm Infants' Gut Microbiome and Neurobehavioral Outcomes

The so-called gut-brain axis has stimulated extensive research on microbiomes. One focus is to assess the association between certain clinical outcomes and the relative abundances of gut microbes, which can be presented as sub-compositional data in conformity with the taxonomic hierarchy of bacteria. Motivated by a study for identifying the microbes in the gut microbiome of preterm infants that impact their later neurobehavioral outcomes, we formulate a constrained integrative multi-view regression, where the neurobehavioral scores form a multivariate response, the sub-compositional microbiome data form multi-view feature matrices, and a set of linear constraints on their corresponding sub-coefficient matrices ensures conformity to the simplex geometry. To enable joint selection and inference of sub-compositions/views, we assume all the sub-coefficient matrices are possibly of low rank, i.e., the outcomes are associated with the microbiome through different sets of latent sub-compositional factors from different taxa. We propose a scaled composite nuclear norm penalization approach for model estimation and develop a hypothesis testing procedure through de-biasing to assess the significance of different views. Simulation studies confirm the effectiveness of the proposed procedure. In the preterm infant study, the identified microbes are mostly consistent with existing studies and biological understanding. Our approach supports the view that stressful early life experiences imprint the gut microbiome through the regulation of the gut-brain axis.

Preprint · 2019 · arXiv

Surface tension of the horizon

The idea of treating the horizon of a black hole as a stretched membrane with surface tension has a long history. In this work, we discuss the microscopic origin of the surface tension of the horizon in quantum pictures of spaces, which are Bose-Einstein condensates of gravitons. The horizon is a phase interface of gravitons, the surface tension of which is found to be a result of the difference in the strength of the interaction between the gravitons on its two sides. The gravitational source, such as a Schwarzschild black hole, creates a transitional zone by changing the energy and distribution of its surrounding gravitons. Archimedes' principle for gravity can be expressed as follows: "the gravity on an object is equal to the weight of the gravitons that it displaces."

Preprint · 2016 · arXiv

Aquaporin-1 can work as a Maxwell's Demon in the Body

Aquaporin-1 (AQP1) is a membrane protein that is selectively permeable to water. Due to its dumbbell shape, AQP1 can sense the size information of solute molecules in osmosis. At the cost of consuming this information, AQP1 can move water against its chemical potential gradient: it is able to work as a kind of Maxwell's Demon. This effect was detected quantitatively by measuring the water osmosis of mouse red blood cells. This ability may protect red blood cells from the eryptosis elicited by osmotic shock when they move through the kidney, where a large gradient of urea is required for the urine concentrating mechanism. This finding anticipates a new beginning of inquiries into the complicated relationships among mass, energy and information in bio-systems.

Preprint · 2016 · arXiv

Carnot's theorem and Szilárd engine

In this work, the relationship between the Carnot engine and the Szilárd engine is discussed. By defining the available information about the temperature difference between two heat reservoirs, the Carnot engine is found to share the same physical essence as the Szilárd engine: lossless conversion of available information. Thus, a generalized Carnot's theorem with a wider scope of application can be stated as "all the available information is 100% coded into work".

Preprint · 2016 · arXiv

Maxwell's demon and information channel width of a black hole

Using a new generalized second law of thermodynamics, the information and entropy of a black hole and its accretion disk are analyzed respectively. We find the bound of the information channel width of a black hole, which is determined by the variation rate of the horizon temperature and the mass of the black hole's shell (the accretion disk close to the horizon).

Preprint · 2016 · arXiv

Modified Kedem-Katchalsky equations for osmosis through nano-pore

This work presents modified Kedem-Katchalsky equations for osmosis through a nano-pore. The osmotic reflection coefficient of a solute was found to be chiefly affected by the entrance of the pore, while the filtration reflection coefficient can be affected by both the entrance and the internal structure of the pore. Using an analytical method, we obtain the quantitative relationship between the osmotic reflection coefficient and the molecule size. The model is verified by comparing the theoretical results with reported experimental data on aquaporin osmosis. Our work is expected to pave the way for a better understanding of osmosis in bio-systems and to suggest new ideas for designing membranes with better performance.

Preprint · 2014 · arXiv

EPR Paradox and Magician's Props

Local realism has been knocked down by experiments with entangled pairs of particles based on Bell's theorem (J. S. Bell, Physics (Long Island City, N.Y.) 1, 195 (1964)). However, there has been continuing debate on whether locality or realism is the problem. In this work, we analyze Bohm's version of the Einstein-Podolsky-Rosen thought experiment using information theory and thermodynamics. Inferring non-locality from EPR experiments would contradict the principle of non-realism of quantum mechanics. Therefore, experiments on quantum entanglement cannot provide any proof against locality.
