Source author record

Lili Yu

Lili Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computation and Language cond-mat.mtrl-sci Machine Learning cond-mat.mes-hall Artificial Intelligence Human-Computer Interaction Information Retrieval physics.soc-ph

Catalog footprint

What is connected

8works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2023arXiv

Scaling Laws for Generative Mixed-Modal Language Models

Generative language models define distributions over sequences of tokens that can represent essentially any combination of data modalities (e.g., any permutation of image tokens from VQ-VAEs, speech tokens from HuBERT, BPE tokens for language or code, and so on). To better understand the scaling properties of such mixed-modal models, we conducted over 250 experiments using seven different modalities and model sizes ranging from 8 million to 30 billion, trained on 5-100 billion tokens. We report new mixed-modal scaling laws that unify the contributions of individual modalities and the interactions between them. Specifically, we explicitly model the optimal synergy and competition due to data and model size as an additive term to previous uni-modal scaling laws. We also find four empirical phenomena observed during the training, such as emergent coordinate-ascent style training that naturally alternates between modalities, guidelines for selecting critical hyper-parameters, and connections between mixed-modal competition and training stability. Finally, we test our scaling law by training a 30B speech-text model, which significantly outperforms the corresponding unimodal models. Overall, our research provides valuable insights into the design and training of mixed-modal generative models, an important new class of unified models that have unique distributional properties.

preprint2020arXiv

Interactive Classification by Asking Informative Questions

We study the potential for interaction in natural language classification. We add a limited form of interaction for intent classification, where users provide an initial query using natural language, and the system asks for additional information using binary or multi-choice questions. At each turn, our system decides between asking the most informative question or making the final classification prediction.The simplicity of the model allows for bootstrapping of the system without interaction data, instead relying on simple crowdsourcing tasks. We evaluate our approach on two domains, showing the benefit of interaction and the advantage of learning to balance between asking additional questions and making the final prediction.

preprint2020arXiv

On the evolution of word usage of classical Chinese poetry

The hierarchy of classical Chinese poetry has been broadly acknowledged by a number of studies in Chinese literature. However, quantitative investigations about the evolutionary linkages of classical Chinese poetry are limited. The primary goal of this study is to provide quantitative evidence of the evolutionary linkages, with emphasis on character usage, among different period genres of classical Chinese poetry. Specifically, various statistical analyses are performed to find and compare the patterns of character usage in the poems of nine period genres, including shi jing, chu ci, Han shi , Jin shi, Tang shi, Song shi, Yuan shi, Ming shi, and Qing shi. The result of analysis indicates that each of nine period genres has unique patterns of character usage, with some Chinese characters that are preferably used in the poems of a particular period genre. The analysis on the general pattern of character preference implies a decreasing trend in the use of Chinese characters that rarely occur in modern Chinese literature along the timeline of dynastic types of classical Chinese poetry. The phylogenetic analysis based on the distance matrix suggests that the evolutionary linkages of different types of classical Chinese poetry are congruent with their chronological order, suggesting that character frequencies contain phylogenetic information that is useful for inferring evolutionary linkages among various types of classical Chinese poetry. The estimated phylogenetic tree identifies four groups (shi jing, chu ci), (Han shi, Jin shi), (Tang shi, Song shi, Yuan shi), and (Ming shi, Qing shi). The statistical analyses conducted in this study can be generalized to analyze the data sets of general Chinese literature. Such analyses can provide quantitative insights about the evolutionary linkages of general Chinese literature.

preprint2020arXiv

Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport

Selecting input features of top relevance has become a popular method for building self-explaining models. In this work, we extend this selective rationalization approach to text matching, where the goal is to jointly select and align text pieces, such as tokens or sentences, as a justification for the downstream prediction. Our approach employs optimal transport (OT) to find a minimal cost alignment between the inputs. However, directly applying OT often produces dense and therefore uninterpretable alignments. To overcome this limitation, we introduce novel constrained variants of the OT problem that result in highly sparse alignments with controllable sparsity. Our model is end-to-end differentiable using the Sinkhorn algorithm for OT and can be trained without any alignment annotations. We evaluate our model on the StackExchange, MultiNews, e-SNLI, and MultiRC datasets. Our model achieves very sparse rationale selections with high fidelity while preserving prediction accuracy compared to strong attention baseline models.

preprint2015arXiv

Parallel Stitching of Two-Dimensional Materials

Diverse parallel stitched two-dimensional heterostructures are synthesized, including metal-semiconductor (graphene-MoS2), semiconductor-semiconductor (WS2-MoS2), and insulator-semiconductor (hBN-MoS2), directly through selective sowing of aromatic molecules as the seeds in chemical vapor deposition (CVD) method. Our methodology enables the large-scale fabrication of lateral heterostructures with arbitrary patterns, and clean and precisely aligned interfaces, which offers tremendous potential for its application in integrated circuits.

preprint2013arXiv

Large-scale 2D Electronics based on Single-layer MoS2 Grown by Chemical Vapor Deposition

2D nanoelectronics based on single-layer MoS2 offers great advantages for both conventional and ubiquitous applications. This paper discusses the large-scale CVD growth of single-layer MoS2 and fabrication of devices and circuits for the first time. Both digital and analog circuits are fabricated to demonstrate its capability for mixed-signal applications.

preprint2012arXiv

Integrated Circuits Based on Bilayer MoS2 Transistors

Two-dimensional (2D) materials, such as molybdenum disulfide (MoS2), have been shown to exhibit excellent electrical and optical properties. The semiconducting nature of MoS2 allows it to overcome the shortcomings of zero-bandgap graphene, while still sharing many of graphene's advantages for electronic and optoelectronic applications. Discrete electronic and optoelectronic components, such as field-effect transistors, sensors and photodetectors made from few-layer MoS2 show promising performance as potential substitute of Si in conventional electronics and of organic and amorphous Si semiconductors in ubiquitous systems and display applications. An important next step is the fabrication of fully integrated multi-stage circuits and logic building blocks on MoS2 to demonstrate its capability for complex digital logic and high-frequency ac applications. This paper demonstrates an inverter, a NAND gate, a static random access memory, and a five-stage ring oscillator based on a direct-coupled transistor logic technology. The circuits comprise between two to twelve transistors seamlessly integrated side-by-side on a single sheet of bilayer MoS2. Both enhancement-mode and depletion-mode transistors were fabricated thanks to the use of gate metals with different work functions.

preprint2010arXiv

Functionalized Graphene for High Performance Two-dimensional Spintronics Devices

Using first-principles calculations, we explore the possibility of functionalized graphene as high performance two-dimensional spintronics device. Graphene functionalized with O on one side and H on the other side in the chair conformation is found to be a ferromagnetic metal with a spin-filter efficiency up to 85% at finite bias. The ground state of graphene semi-functionalized with F in the chair conformation is an antiferromagnetic semiconductor, and we construct a magnetoresistive device from it by introducing a magnetic field to stabilize its ferromagnetic metallic state. The resulting room-temperature magnetoresistance is up to 5400%, which is one order of magnitude larger than the available experimental values.

Lili Yu

What is connected

Connect this record

See the researcher in context

Building this map preview

8 published item(s)

Scaling Laws for Generative Mixed-Modal Language Models

Interactive Classification by Asking Informative Questions

On the evolution of word usage of classical Chinese poetry

Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport

Parallel Stitching of Two-Dimensional Materials

Large-scale 2D Electronics based on Single-layer MoS2 Grown by Chemical Vapor Deposition

Integrated Circuits Based on Bilayer MoS2 Transistors

Functionalized Graphene for High Performance Two-dimensional Spintronics Devices