Researcher profile

Hao Liu

Hao Liu contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
10 works
0 followers
11 topics
4 close collaborators

Published work

10 published items

preprint · 2026 · arXiv

Beyond Direct Generation: A Decomposed Approach to Well-Crafted Screenwriting with LLMs

The screenplay serves as the foundation for television production, defining narrative structure, character development, and dialogue. While Large Language Models (LLMs) show great potential in creative writing, direct end-to-end generation approaches often fail to produce well-crafted screenplays. We argue this failure stems from forcing a single model to simultaneously master two disparate capabilities: creative narrative construction and rigid format adherence. The resulting outputs may mimic superficial style but lack the deep structural integrity and storytelling substance required for professional use. To enable LLMs to generate high-quality screenplays, we introduce Dual-Stage Refinement (DSR), a decomposed framework that decouples creative narrative generation from format conversion. The first stage transforms a brief outline into rich, novel-style prose. The second stage refines this narrative into a professionally formatted screenplay. This separation enables the model to specialize in one distinct capability at each stage. A key challenge in implementing DSR is the scarcity of paired outline-to-novel training data. We address this through hybrid data synthesis: reverse synthesis deconstructs existing screenplays into structured inputs, while forward synthesis leverages these inputs to generate high-quality narrative texts as training targets. Blind evaluations by professional screenwriters show that DSR achieves a 75% win rate against strong baselines like Gemini-2.5-Pro and reaches 82.7% of human-level performance. Our work demonstrates that decomposed generation architecture with tailored data synthesis effectively specializes LLMs in complex creative domains.

preprint · 2026 · arXiv

CAMPA: Efficient and Aligned Multimodal Graph Learning via Decoupled Propagation and Aggregation

Multimodal Graph Neural Networks (MGNNs) have shown strong potential for learning from multimodal attributed graphs, yet most existing approaches rely on tightly coupled architectures that suffer from prohibitive computational overhead. In this paper, we present a systematic empirical analysis showing that decoupled MGNNs are substantially more efficient and scalable for large-scale graph learning. However, we identify a critical bottleneck in existing decoupled pipelines, namely modal conflict, which arises in both the propagation and aggregation stages. Specifically, independent multi-hop diffusion causes cross-modal semantic divergence during propagation, while naive fusion fails to align multi-hop feature trajectories during aggregation, jointly limiting effective representation learning. To address this challenge, we propose CAMPA, a Cross-modal Aligned Multimodal Propagation & Aggregation framework for decoupled multimodal graph learning. Concretely, CAMPA introduces a two-stage alignment mechanism: (1) cross-modal aligned propagation, which injects cross-modal similarity priors into message passing to preserve semantic consistency without additional parameter overhead; (2) trajectory aligned aggregation, which leverages trajectory-level self-attention and cross-attention to capture and align long-range dependencies across modalities and hops. Extensive experiments on diverse benchmark datasets and tasks demonstrate that CAMPA consistently outperforms strong coupled and decoupled baselines while preserving the efficiency advantages of the decoupled paradigm.

preprint · 2026 · arXiv

Electronic Nematicity Revealed by Polarized Ultrafast Spectroscopy in Bilayer La$_3$Ni$_2$O$_7$

We report a polarized ultrafast pump-probe study of the normal-state electronic dynamics in bilayer La$_3$Ni$_2$O$_7$ and trilayer La$_4$Ni$_3$O$_{10}$ single crystals at ambient pressure. While both nickelates exhibit density-wave (DW) transitions accompanied by the opening of a quasiparticle relaxation bottleneck, their electronic responses display strikingly different symmetry properties. La$_4$Ni$_3$O$_{10}$ maintains an isotropic optical response across the entire temperature range. In contrast, La$_3$Ni$_2$O$_7$ exhibits a pronounced twofold ($C_2$) anisotropy in its low-temperature electronic dynamics. This electronic nematicity, evident in both the relaxation dynamics and the effective gap scales, competes with a secondary isotropic order emerging below 115 K. The presence of macroscopic electronic anisotropy in the bilayer system, and its absence in the trilayer system, suggests an intimate relation between electronic nematic fluctuations and superconducting pairing in La$_3$Ni$_2$O$_7$ that merits deeper exploration.

preprint · 2026 · arXiv

MIST: Towards Multi-dimensional Implicit BiaS Evaluation of LLMs for Theory of Mind

Theory of Mind (ToM) in Large Language Models (LLMs) refers to the model's ability to infer the mental states of others, with failures in this ability often manifesting as systemic implicit biases. Assessing this challenge is difficult, as traditional direct inquiry methods are often met with refusal to answer and fail to capture its subtle and multidimensional nature. Therefore, we propose MIST, which reconceptualizes the content model of stereotypes into multidimensional failures of ToM, specifically in the domains of competence, sociability, and morality. The framework introduces two indirect tasks. The Word Association Bias Test (WABT) assesses implicit lexical associations, while the Affective Attribution Test (AAT) measures implicit emotional tendencies, aiming to uncover latent stereotypes without triggering model avoidance. Through extensive experimentation on eight state-of-the-art LLMs, our framework demonstrates the ability to reveal complex bias structures and improved robustness. All data and code will be released.

preprint · 2026 · arXiv

MixTTE: Multi-Level Mixture-of-Experts for Scalable and Adaptive Travel Time Estimation

Accurate Travel Time Estimation (TTE) is critical for ride-hailing platforms, where errors directly impact user experience and operational efficiency. While existing production systems excel at holistic route-level dependency modeling, they struggle to capture city-scale traffic dynamics and long-tail scenarios, leading to unreliable predictions in large urban networks. In this paper, we propose MixTTE, a scalable and adaptive framework that synergistically integrates link-level modeling with industrial route-level TTE systems. Specifically, we propose a spatio-temporal external attention module to capture global traffic dynamic dependencies across million-scale road networks efficiently. Moreover, we construct a stabilized graph mixture-of-experts network to handle heterogeneous traffic patterns while maintaining inference efficiency. Furthermore, an asynchronous incremental learning strategy is tailored to enable real-time and stable adaptation to dynamic traffic distribution shifts. Experiments on real-world datasets validate that MixTTE significantly reduces prediction errors compared to seven baselines. MixTTE has been deployed in DiDi, substantially improving the accuracy and stability of the TTE service.

preprint · 2026 · arXiv

On the forward self-similar solutions to the two-dimensional Navier-Stokes equations

We establish the global existence of forward self-similar solutions to the two-dimensional incompressible Navier-Stokes equations for any divergence-free initial velocity that is homogeneous of degree $-1$ and locally Hölder continuous. This result requires no smallness assumption on the initial data. In sharp contrast to the three-dimensional case, where $(-1)$-homogeneous vector fields are locally square-integrable, the major difficulty for the 2D problem is the criticality in the sense that the initial kinetic energy is locally infinite at the origin, and the initial vorticity fails to be locally integrable, so that the classical local energy estimates are not available. Our key ideas are to decompose the solution into a linear part solving the heat equation and a finite-energy perturbation part, and to exploit an inherent cancellation relation between the linear part and the perturbation part. These, together with suitable choices of multipliers, enable us to control the interaction terms and to establish the $H^1$-estimates for the perturbation part. Furthermore, we obtain an optimal pointwise estimate by investigating the corresponding Leray equations in weighted Sobolev spaces. This gives the faster decay of the perturbation part at infinity and compactness, which play important roles in proving the existence of global-in-time self-similar solutions.

preprint · 2026 · arXiv

Practical Scaling Laws: Converting Compute into Performance in a Data-Constrained World

The scaling laws guiding modern model training were calibrated for a single regime: data-rich, single-epoch pretraining. The dominant form of such scaling laws, Chinchilla's $L = E + A/N^\alpha + B/D^\beta$, has three structural limitations outside that regime: it diverges as unique data shrinks instead of saturating at the uninformed baseline; it cannot represent overfitting when capacity exceeds the data; and it conflates total examples seen with unique examples available. We propose a closed-form extension, $L(N, D, T) = E + (L_0 - E)\,h/(1+h)$ with $h = a/N^\alpha + b/T^\beta + c\,N^\gamma/D^\delta$, that decomposes loss into undercapacity, undertraining, and overfitting terms. It saturates between the irreducible loss $E$ and an uninformed baseline $L_0$ fixed by the loss type, and reduces to Chinchilla in the data-rich, single-epoch limit. We validate it on four multi-epoch experiments spanning four architecture families (MLPs, ResNets, Fourier neural operators, and transformers) across vision, scientific ML, and language domains, and refit it to five published LLM scaling-law grids. Extrapolating to higher compute and larger unique data than seen at fit time, our form achieves state-of-the-art RMSE on every published LLM grid we evaluate and on most cells of our constructed experiments. Once calibrated, the form admits a cost-aware allocation that recovers Chinchilla's optimum when data is free and shifts toward smaller corpora and more epochs as data grows expensive.
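
The two loss forms quoted in this abstract can be compared numerically. The following is a minimal sketch, not the authors' code: the function names and every coefficient value are made up for illustration, and it assumes $N$ is the parameter count, $D$ the number of unique examples, and $T$ the total examples seen. The identification $A = (L_0 - E)\,a$ and $B = (L_0 - E)\,b$ used for the comparison is an assumption that follows from taking $h \ll 1$ and $T = D$ in the extended form.

```python
def chinchilla_loss(N, D, *, E, A, B, alpha, beta):
    """Chinchilla form quoted in the abstract: L = E + A/N^alpha + B/D^beta."""
    return E + A / N**alpha + B / D**beta


def extended_loss(N, D, T, *, E, L0, a, b, c, alpha, beta, gamma, delta):
    """Extended form quoted in the abstract: L = E + (L0 - E) * h / (1 + h).

    h sums an undercapacity term (a/N^alpha), an undertraining term
    (b/T^beta), and an overfitting term (c * N^gamma / D^delta).
    """
    h = a / N**alpha + b / T**beta + c * N**gamma / D**delta
    return E + (L0 - E) * h / (1.0 + h)


# Illustrative coefficients only (hypothetical, not fitted to any data).
PARAMS = dict(E=2.0, L0=10.0, a=100.0, b=100.0, c=1.0,
              alpha=0.5, beta=0.5, gamma=1.0, delta=2.0)

# Data-rich, single-epoch regime: T equals D and h is small.
rich = extended_loss(1e12, 1e12, 1e12, **PARAMS)
scale = PARAMS["L0"] - PARAMS["E"]
reference = chinchilla_loss(1e12, 1e12, E=PARAMS["E"],
                            A=scale * PARAMS["a"], B=scale * PARAMS["b"],
                            alpha=PARAMS["alpha"], beta=PARAMS["beta"])

# Data-scarce regime: unique data shrinks, so the loss approaches L0
# instead of diverging.
scarce = extended_loss(1e6, 10.0, 1e6, **PARAMS)

# Overfitting: with D and T fixed, a far larger model has a higher loss.
moderate_model = extended_loss(1e8, 1e4, 1e6, **PARAMS)
oversized_model = extended_loss(1e12, 1e4, 1e6, **PARAMS)

if __name__ == "__main__":
    print("data-rich extended vs Chinchilla:", rich, reference)
    print("data-scarce loss (bounded by L0):", scarce)
    print("moderate vs oversized model:", moderate_model, oversized_model)
```

With these toy values the extended form stays between $E$ and $L_0$ in every regime, which is the saturation behavior the abstract describes. A real calibration would fit the coefficients to measured losses.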

preprint · 2026 · arXiv

Rewarding Creativity: A Human-Aligned Generative Reward Model for Reinforcement Learning in Storytelling

While Large Language Models (LLMs) can generate fluent text, producing high-quality creative stories remains challenging. Reinforcement Learning (RL) offers a promising solution but faces two critical obstacles: designing reliable reward signals for subjective storytelling quality and mitigating training instability. This paper introduces the Reinforcement Learning for Creative Storytelling (RLCS) framework to systematically address both challenges. First, we develop a Generative Reward Model (GenRM) that provides multi-dimensional analysis and explicit reasoning about story preferences, trained through supervised fine-tuning on demonstrations with reasoning chains distilled from strong teacher models, followed by GRPO-based refinement on expanded preference data. Second, we introduce an entropy-based reward shaping strategy that dynamically prioritizes learning on confident errors and uncertain correct predictions, preventing overfitting on already-mastered patterns. Experiments demonstrate that GenRM achieves 68% alignment with human creativity judgments, and RLCS significantly outperforms strong baselines including Gemini-2.5-Pro in overall story quality. This work provides a practical pipeline for applying RL to creative domains, effectively navigating the dual challenges of reward modeling and training stability.

preprint · 2026 · arXiv

The ICASSP 2026 Automatic Song Aesthetics Evaluation Challenge

This paper summarizes the ICASSP 2026 Automatic Song Aesthetics Evaluation (ASAE) Challenge, which focuses on predicting the subjective aesthetic scores of AI-generated songs. The challenge consists of two tracks: Track 1 targets the prediction of the overall musicality score, while Track 2 focuses on predicting five fine-grained aesthetic scores. The challenge attracted strong interest from the research community and received numerous submissions from both academia and industry. Top-performing systems significantly surpassed the official baseline, demonstrating substantial progress in aligning objective metrics with human aesthetic preferences. The outcomes establish a standardized benchmark and advance human-aligned evaluation methodologies for modern music generation systems.

preprint · 2025 · arXiv

TTC: Transformer-based TDE Classifier for the Wide Field Survey Telescope (WFST)

We propose the Transformer-based Tidal Disruption Event (TDE) Classifier (TTC), specifically designed to operate effectively with both real-time alert streams and archival data of the Wide Field Survey Telescope (WFST). It aims to minimize the reliance on external catalogs and find TDE candidates from pure light curves, which is more suitable for finding TDEs in faint and distant galaxies. TTC consists of two key modules that can work independently: (1) a light curve parametric fitting module and (2) a Transformer (Mgformer)-based classification network. The training of the latter module and the evaluation of each module use a light curve dataset of 7413 spectroscopically classified transients from the Zwicky Transient Facility (ZTF). The Mgformer-based module is superior in performance and flexibility. Its representative recall and precision values are 0.79 and 0.76, respectively, and can be modified by adjusting the threshold. It can also efficiently find TDE candidates within 30 days from the first detection. For comparison, the parametric fitting module yields values of 0.72 and 0.40, respectively, while it is >10 times faster in average speed. Hence, the setup of modules allows a trade-off between performance and time, as well as precision and recall. TTC has successfully picked out all spectroscopically identified TDEs among ZTF transients in a real-time classification test, and selected about 20 TDE candidates in the deep field survey data of WFST. The discovery rate will greatly increase once the differential database for the wide field survey is ready.
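
The abstract notes that recall and precision can be changed by adjusting the classification threshold. The sketch below illustrates that general trade-off on a toy set of scores. It is not TTC's code: the function name, the scores, and the labels are all hypothetical, and returning a precision of 1.0 when nothing is selected is a convention chosen here.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall) when items scoring >= threshold are selected.

    scores: classifier outputs in [0, 1]; labels: True for a real positive.
    """
    selected = [s >= threshold for s in scores]
    tp = sum(1 for sel, y in zip(selected, labels) if sel and y)
    fp = sum(1 for sel, y in zip(selected, labels) if sel and not y)
    fn = sum(1 for sel, y in zip(selected, labels) if not sel and y)
    # Convention (an assumption): precision is 1.0 when nothing is selected.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data, made up for illustration (not TTC outputs).
SCORES = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
LABELS = [True, True, False, True, False, True, False, False]

if __name__ == "__main__":
    for threshold in (0.5, 0.85):
        p, r = precision_recall(SCORES, LABELS, threshold)
        print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

On this toy set, raising the threshold from 0.5 to 0.85 raises precision and lowers recall, which is the kind of adjustment the abstract describes.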