Source author record

Jie Lian

Jie Lian appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computation and Language, Sound, cond-mat.mtrl-sci, Artificial Intelligence, cond-mat.mes-hall, eess.AS

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators

preprint · 2026 · arXiv

A Unified Spoken Language Model with Injected Emotional-Attribution Thinking for Human-like Interaction

This paper presents a unified spoken language model for emotional intelligence, enhanced by a novel data construction strategy termed Injected Emotional-Attribution Thinking (IEAT). IEAT incorporates user emotional states and their underlying causes into the model's internal reasoning process, enabling emotion-aware reasoning to be internalized rather than treated as explicit supervision. The model is trained with a two-stage progressive strategy. The first stage performs speech-text alignment and emotional attribute modeling via self-distillation, while the second stage conducts end-to-end cross-modal joint optimization to ensure consistency between textual and spoken emotional expressions. Experiments on the Human-like Spoken Dialogue Systems Challenge (HumDial) Emotional Intelligence benchmark demonstrate that the proposed approach achieves top-ranked performance across emotional trajectory modeling, emotional reasoning, and empathetic response generation under both LLM-based and human evaluations.

preprint · 2026 · arXiv

TELEVAL: A Dynamic Benchmark Designed for Spoken Language Models in Chinese Interactive Scenarios

Spoken language models (SLMs) have advanced rapidly in recent years, accompanied by a growing number of evaluation benchmarks. However, most existing benchmarks emphasize task completion and capability scaling, while remaining poorly aligned with how users interact with SLMs in real-world spoken conversations. Effective spoken interaction requires not only accurate understanding of user intent and content, but also the ability to respond with appropriate interactional strategies. In this paper, we present TELEVAL, a dynamic, user-centered benchmark for evaluating SLMs in realistic Chinese spoken interaction scenarios. TELEVAL consolidates evaluation into two core aspects. Reliable Content Fulfillment assesses whether models can comprehend spoken inputs and produce semantically correct responses. Interactional Appropriateness evaluates whether models act as socially capable interlocutors, requiring them not only to generate human-like, colloquial responses, but also to implicitly incorporate paralinguistic cues for natural interaction. Experiments reveal that, despite strong performance on semantic and knowledge-oriented tasks, current SLMs still struggle to produce natural and interactionally appropriate responses, highlighting the need for more interaction-faithful evaluation.

preprint · 2026 · arXiv

Tibetan-TTS: Low-Resource Tibetan Speech Synthesis with Large Model Adaptation

Tibetan text-to-speech (TTS) has long been challenged by scarce speech resources, significant dialectal variation, and the complex mapping between written text and spoken pronunciation. To address these issues, this work presents, to the best of our knowledge, the first large-model-based Tibetan TTS system in the industry, built upon a large speech synthesis model developed by Xingchen AGI Lab. The proposed system integrates data quality enhancement, Tibetan-oriented text representation and tokenizer adaptation, and cross-lingual adaptive training for low-resource Tibetan speech synthesis. Experimental results show that the system can generate stable, natural, and intelligible Tibetan speech under low-resource conditions. In subjective evaluation, the MOS scores of the syllable-level and BPE-based systems reach 4.28 and 4.35, while their pronunciation accuracies reach 97.6% and 96.6%, respectively, outperforming an external commercial Tibetan TTS interface. These results demonstrate that combining a large-model backbone with Tibetan-oriented text representation adaptation and cross-lingual adaptive training enables highly usable low-resource Tibetan speech synthesis, and also provides a technical foundation for future unified multi-dialect Tibetan speech synthesis.

preprint · 2015 · arXiv

Two-Dimensional Van der Waals Epitaxy Kinetics in a Three-Dimensional Perovskite Halide

The exploration of emerging materials physics and prospective applications of two-dimensional materials relies heavily on controlling the thickness, phases, morphologies and film-substrate interactions during growth. Though substantial progress has been made in developing two-dimensional films from conventional layered bulk materials, particular challenges remain in obtaining ultrathin, single-crystalline, dislocation-free films from intrinsically non-Van der Waals-type three-dimensional materials. In this report, with the successful demonstration of a single-crystalline, ultrathin, large-scale perovskite halide material, we reveal and identify the favorable role of the weak Van der Waals film-substrate interaction in the nucleation and growth of two-dimensional morphology from non-layered materials, compared to conventional epitaxy. We also show how the bonding nature of the three-dimensional material itself affects the kinetic energy landscape of ultrathin film growth. By studying the formation of fractal perovskites with the aid of Monte Carlo simulations, we demonstrate that the competition between Van der Waals diffusion and the surface free energy of the perovskite leads to film thickening, suggesting that extra strategies such as surface passivation may be needed for the growth of monolayer and few-layer films.

preprint · 2010 · arXiv

Large-scale Graphitic Thin Films Synthesized on Ni and Transferred to Insulators: Structural and Electronic Properties

We present a comprehensive study of the structural and electronic properties of ultrathin films containing graphene layers, synthesized by chemical vapor deposition (CVD) based surface segregation on polycrystalline Ni foils and then transferred onto insulating SiO2/Si substrates. Films up to several millimeters in size have been synthesized. Structural characterizations by atomic force microscopy (AFM), scanning tunneling microscopy (STM), cross-sectional transmission electron microscopy (XTEM) and Raman spectroscopy confirm that such large-scale graphitic thin films (GTF) contain both thick graphite regions and thin regions of few-layer graphene. The films also contain many wrinkles, with sharply bent tips and dislocations revealed by XTEM, yielding insights into the growth and buckling processes of the GTF. Measurements on mm-scale back-gated transistor devices fabricated from the transferred GTF show an ambipolar field effect with resistance modulation of ~50% and carrier mobilities reaching ~2000 cm^2/Vs. We also demonstrate quantum transport of carriers with a phase coherence length over 0.2 μm from the observation of 2D weak localization in low-temperature magneto-transport measurements. Our results show that despite the non-uniformity and surface roughness, such large-scale, flexible thin films can have electronic properties promising for device applications.
