Researcher profile

Shengzhong Liu

Shengzhong Liu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2026arXiv

BubbleSpec: Turning Long-Tail Bubbles into Speculative Rollout Drafts for Synchronous Reinforcement Learning

Reinforcement Learning (RL) has become a cornerstone for improving the performance of Large Language Models (LLMs). However, its rollout phase constitutes a significant efficiency bottleneck, mainly arising from the long-tail bubbles across data parallel ranks, particularly in long-context scenarios where faster GPUs remain idle while waiting for stragglers. Existing solutions, such as partial rollout or asynchronous RL, mitigate these bubbles by compromising the algorithm's strict synchronous nature. Instead, we propose BubbleSpec, a novel framework that accelerates RL rollouts while strictly keeping the mathematical exactness. Instead of attempting to eliminate bubbles, BubbleSpec exploits them. We exploit the idle time windows of faster ranks to pre-generate rollout results for subsequent steps, serving as drafts for speculative decoding. Unlike prior speculative methods that rely on historical epoch similarity and warm-ups, BubbleSpec is agnostic to dataset size and provides immediate acceleration from the onset of training. Extensive evaluations demonstrate that BubbleSpec reduces decoding steps by 50% and increases rollout throughput by up to 1.8x. Critically, BubbleSpec is seamlessly compatible with various RL frameworks and strategies as it sustains the strict synchronous property of RL algorithms.

preprint2026arXiv

Dywave: Event-Aligned Dynamic Tokenization for Heterogeneous IoT Sensing Signals

Internet of Things (IoT) systems continuously collect heterogeneous sensing signals from ubiquitous sensors to support intelligent applications such as human activity analysis, emotion monitoring, and environmental perception. These signals are inherently non-stationary and multi-scale, posing unique challenges for standard tokenization techniques. This paper proposes Dywave, a dynamic tokenization framework for IoT sensing signals that constructs compact input representations aligned with intrinsic temporal structures and underlying physical events. Dywave leverages wavelet-based hierarchical decomposition, identifies meaningful temporal boundaries corresponding to underlying semantic events, and adaptively compresses redundant intervals while preserving temporal coherence. Extensive evaluations on five real-world IoT sensing datasets across activity recognition, stress assessment, and nearby object detection demonstrate that Dywave outperforms state-of-the-art methods by up to 12% in accuracy, while improving computational efficiency by reducing input token lengths by up to 75% across mainstream sequence models. Moreover, Dywave exhibits improved robustness to domain shifts and varying sequence lengths.

preprint2022arXiv

Unsupervised Belief Representation Learning with Information-Theoretic Variational Graph Auto-Encoders

This paper develops a novel unsupervised algorithm for belief representation learning in polarized networks that (i) uncovers the latent dimensions of the underlying belief space and (ii) jointly embeds users and content items (that they interact with) into that space in a manner that facilitates a number of downstream tasks, such as stance detection, stance prediction, and ideology mapping. Inspired by total correlation in information theory, we propose the Information-Theoretic Variational Graph Auto-Encoder (InfoVGAE) that learns to project both users and content items (e.g., posts that represent user views) into an appropriate disentangled latent space. To better disentangle latent variables in that space, we develop a total correlation regularization module, a Proportional-Integral (PI) control module, and adopt rectified Gaussian distribution to ensure the orthogonality. The latent representation of users and content can then be used to quantify their ideological leaning and detect/predict their stances on issues. We evaluate the performance of the proposed InfoVGAE on three real-world datasets, of which two are collected from Twitter and one from U.S. Congress voting records. The evaluation results show that our model outperforms state-of-the-art unsupervised models by reducing 10.5% user clustering errors and achieving 12.1% higher F1 scores for stance separation of content items. In addition, InfoVGAE produces a comparable result with supervised models. We also discuss its performance on stance prediction and user ranking within ideological groups.

preprint2020arXiv

ControlVAE: Controllable Variational Autoencoder

Variational Autoencoders (VAE) and their variants have been widely used in a variety of applications, such as dialog generation, image generation and disentangled representation learning. However, the existing VAE models have some limitations in different applications. For example, a VAE easily suffers from KL vanishing in language modeling and low reconstruction quality for disentangling. To address these issues, we propose a novel controllable variational autoencoder framework, ControlVAE, that combines a controller, inspired by automatic control theory, with the basic VAE to improve the performance of resulting generative models. Specifically, we design a new non-linear PI controller, a variant of the proportional-integral-derivative (PID) control, to automatically tune the hyperparameter (weight) added in the VAE objective using the output KL-divergence as feedback during model training. The framework is evaluated using three applications; namely, language modeling, disentangled representation learning, and image generation. The results show that ControlVAE can achieve better disentangling and reconstruction quality than the existing methods. For language modelling, it not only averts the KL-vanishing, but also improves the diversity of generated text. Finally, we also demonstrate that ControlVAE improves the reconstruction quality of generated images compared to the original VAE.

preprint2020arXiv

paper2repo: GitHub Repository Recommendation for Academic Papers

GitHub has become a popular social application platform, where a large number of users post their open source projects. In particular, an increasing number of researchers release repositories of source code related to their research papers in order to attract more people to follow their work. Motivated by this trend, we describe a novel item-item cross-platform recommender system, $\textit{paper2repo}$, that recommends relevant repositories on GitHub that match a given paper in an academic search system such as Microsoft Academic. The key challenge is to identify the similarity between an input paper and its related repositories across the two platforms, $\textit{without the benefit of human labeling}$. Towards that end, paper2repo integrates text encoding and constrained graph convolutional networks (GCN) to automatically learn and map the embeddings of papers and repositories into the same space, where proximity offers the basis for recommendation. To make our method more practical in real life systems, labels used for model training are computed automatically from features of user actions on GitHub. In machine learning, such automatic labeling is often called {\em distant supervision\/}. To the authors' knowledge, this is the first distant-supervised cross-platform (paper to repository) matching system. We evaluate the performance of paper2repo on real-world data sets collected from GitHub and Microsoft Academic. Results demonstrate that it outperforms other state of the art recommendation methods.

preprint2020arXiv

Revisiting Over-smoothing in Deep GCNs

Oversmoothing has been assumed to be the major cause of performance drop in deep graph convolutional networks (GCNs). In this paper, we propose a new view that deep GCNs can actually learn to anti-oversmooth during training. This work interprets a standard GCN architecture as layerwise integration of a Multi-layer Perceptron (MLP) and graph regularization. We analyze and conclude that before training, the final representation of a deep GCN does over-smooth, however, it learns anti-oversmoothing during training. Based on the conclusion, the paper further designs a cheap but effective trick to improve GCN training. We verify our conclusions and evaluate the trick on three citation networks and further provide insights on neighborhood aggregation in GCNs.

preprint2019arXiv

Comparative studies of optoelectrical properties of prominent PV materials: Halide Perovskite, CdTe, and GaAs

We compare three representative high performance PV materials: halide perovskite MAPbI3, CdTe, and GaAs, in terms of photoluminescence (PL) efficiency, PL lineshape, carrier diffusion, and surface recombination, over multiple orders of photo-excitation density. An analytic model is used to describe the excitation density dependence of PL intensity and extract the internal PL efficiency and multiple pertinent recombination parameters. A PL imaging technique is used to obtain carrier diffusion length without using a PL quencher, thus, free of unintended influence beyond pure diffusion. Our results show that perovskite samples tend to exhibit lower Shockley-Read-Hall (SRH) recombination rate in both bulk and surface, thus higher PL efficiency than the inorganic counterparts, particularly under low excitation density, even with no or preliminary surface passivation. PL lineshape and diffusion analysis indicate that there is considerable structural disordering in the perovskite materials, and thus photo-generated carriers are not in global thermal equilibrium, which in turn suppresses the nonradiative recombination. This study suggests that relatively low point-defect density, less detrimental surface recombination, and moderate structural disordering contribute to the high PV efficiency in the perovskite. This comparative photovoltaics study provides more insights into the fundamental material science and the search for optimal device designs by learning from different technologies.