Researcher profile

Liangjie Hong

Liangjie Hong contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2026arXiv

SAPO: Step-Aligned Policy Optimization for Reasoning-Based Generative Recommendation

Generative recommendation treats next-item prediction as autoregressive item-identifier generation. Specifically, items are encoded as semantic identifiers (SIDs), which are short coarse-to-fine token sequences whose early tokens capture broad semantics and later tokens refine them. Recent work augments this paradigm with reasoning traces and optimizes them via reinforcement learning with verifiable rewards, typically outcome-reward algorithm with exact-match feedback on the generated SID. However, in large-catalog recommendation, exact-match feedback on the generated SID only reports whether the final item is correct; when a generated SID mismatches, outcome-reward cannot identify which SID-token prediction caused the mismatch and may penalize matched SID-token positions together with the mismatched position. We identify that the natural unit of credit assignment in this setting is a single reasoning step (one thinking block paired with one SID token). We instantiate this idea in SAPO (Step-Aligned Policy Optimization): rather than broadcasting one advantage to the whole response, SAPO computes a separate group-relative advantage for each reasoning step and applies it only to the corresponding thinking block and SID token. Across three real-world recommendation datasets, SAPO stabilizes reinforcement-learning training and consistently improves over existing generative recommendation baselines, with the largest gains where sparse exact-match feedback makes reasoning-step credit assignment important. Our results suggest that reinforcement-learning objectives for structured generation should mirror the decoder's own decomposition of the output.

preprint2022arXiv

Remote Work Optimization with Robust Multi-channel Graph Neural Networks

The spread of COVID-19 leads to the global shutdown of many corporate offices, and encourages companies to open more opportunities that allow employees to work from a remote location. As the workplace type expands from onsite offices to remote areas, an emerging challenge for an online hiring marketplace is how these remote opportunities and user intentions to work remotely can be modeled and matched without prior information. Despite the unprecedented amount of remote jobs posted amid COVID-19, there is no existing approach that can be directly applied. Introducing a brand new workplace type naturally leads to the cold-start problem, which is particularly more common for less active job seekers. It is challenging, if not impossible, to onboard a new workplace type for any predictive model if existing information sources can provide little information related to a new category of jobs, including data from resumes and job descriptions. Hence, in this work, we aim to propose a principled approach that jointly models the remoteness of job seekers and job opportunities with limited information, which also suffices the needs of web-scale applications. Existing research on the emerging type of remote workplace mainly focuses on qualitative studies, and classic predictive modeling approaches are inapplicable considering the problem of cold-start and information scarcity. We precisely try to close this gap with a novel graph neural architecture. Extensive experiments on large-scale data from real-world applications have been conducted to validate the superiority of the proposed approach over competitive baselines. The improvement may translate to more rapid onboarding of the new workplace type that can benefit job seekers who are interested in working remotely.

preprint2019arXiv

The Identification and Estimation of Direct and Indirect Effects in A/B Tests through Causal Mediation Analysis

E-commerce companies have a number of online products, such as organic search, sponsored search, and recommendation modules, to fulfill customer needs. Although each of these products provides a unique opportunity for users to interact with a portion of the overall inventory, they are all similar channels for users and compete for limited time and monetary budgets of users. To optimize users' overall experiences on an E-commerce platform, instead of understanding and improving different products separately, it is important to gain insights into the evidence that a change in one product would induce users to change their behaviors in others, which may be due to the fact that these products are functionally similar. In this paper, we introduce causal mediation analysis as a formal statistical tool to reveal the underlying causal mechanisms. Existing literature provides little guidance on cases where multiple unmeasured causally-dependent mediators exist, which are common in A/B tests. We seek a novel approach to identify in those scenarios direct and indirect effects of the treatment. In the end, we demonstrate the effectiveness of the proposed method in data from Etsy's real A/B tests and shed lights on complex relationships between different products.