Researcher profile

Zuo-Jun Max Shen

Zuo-Jun Max Shen contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
7 works
0 followers
9 topics
4 close collaborators

Published work

7 published items

preprint | 2026 | arXiv

ORPR: An OR-Guided Pretrain-then-Reinforce Learning Model for Inventory Management

As the pursuit of synergy between Artificial Intelligence (AI) and Operations Research (OR) gains momentum in handling complex inventory systems, a critical challenge persists: how to effectively reconcile AI's adaptive perception with OR's structural rigor. To bridge this gap, we propose a novel OR-Guided "Pretrain-then-Reinforce" framework. To provide structured guidance, we propose a simulation-augmented OR model that generates high-quality reference decisions, implicitly capturing complex business constraints and managerial preferences. Leveraging these OR-derived decisions as foundational training labels, we design a domain-informed deep learning foundation model to establish foundational decision-making capabilities, followed by a reinforcement learning (RL) fine-tuning stage. Uniquely, we position RL as a deep alignment mechanism that enables the AI agent to internalize the optimality principles of OR, while simultaneously leveraging exploration for general policy refinement and allowing expert guidance for scenario-specific adaptation (e.g., promotional events). Validated through extensive numerical experiments and a field deployment at JD.com augmented by a Difference-in-Differences (DiD) analysis, our model significantly outperforms incumbent industrial practices, delivering real-world gains of a 5.27-day reduction in turnover and a 2.29% increase in in-stock rates, alongside a 29.95% decrease in holding costs. Contrary to the prevailing trend of brute-force model scaling, our study demonstrates that a lightweight, domain-informed model can deliver state-of-the-art performance and robust transferability when guided by structured OR logic. This approach offers a scalable and cost-effective paradigm for intelligent supply chain management, highlighting the value of deeply aligning AI with OR.
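
The abstract above mentions a Difference-in-Differences (DiD) analysis without stating the estimator used. As a rough illustration only, the textbook two-group, two-period DiD estimate can be sketched as follows. The function name and the sample numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean
from typing import Sequence


def did_estimate(
    treated_pre: Sequence[float],
    treated_post: Sequence[float],
    control_pre: Sequence[float],
    control_post: Sequence[float],
) -> float:
    """Textbook 2x2 difference-in-differences estimate.

    Returns (change in treated group mean) minus (change in control group mean).
    This is an illustrative sketch, not the estimator used in the paper.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical turnover days, invented purely for illustration.
example = did_estimate(
    treated_pre=[30.0, 32.0],
    treated_post=[25.0, 27.0],
    control_pre=[31.0, 33.0],
    control_post=[30.0, 32.0],
)
```

With these made-up numbers the treated group improves by 5 days and the control group by 1 day, so the estimate attributes a 4-day reduction to the treatment, under the usual parallel-trends assumption.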

preprint | 2026 | arXiv

Rethinking Supply Chain Planning: A Generative Paradigm

Supply chain planning is the critical process of anticipating future demand and coordinating operational activities across the logistics network. However, within the context of contemporary e-commerce, traditional planning paradigms, typically characterized by fragmented processes and static optimization, prove inadequate in addressing dynamic demand, organizational silos, and the complexity of multi-stage coordination. To address these challenges, this study proposes a fundamental rethinking of supply chain planning, redefining it not merely as a computational task, but as an interactive, integrated, and automated cognitive process. This new paradigm emphasizes the organic unification of human strategic intent with adaptive execution, shifting the focus from rigid control to continuous, intelligent orchestration. To operationalize this conceptual shift, we introduce a Generative AI-powered agentic framework. Functioning as an intelligent cognitive interface, this framework bridges the gap between unstructured business contexts and structured analytical workflows, enabling the system to comprehend complex semantics and coordinate decisions across organizational boundaries. We demonstrate the empirical validity of this approach within JD.com's large-scale operations. The deployment confirms the efficacy of this cognitive paradigm, yielding an approximate 22% improvement in planning accuracy and a 2% increase in in-stock rates, thereby validating the transformation of planning into an adaptive, knowledge-driven capability.

preprint | 2023 | arXiv

Evaluation of Public Transit Systems under Short Random Service Suspensions: A Bulk-Service Queuing Approach

This paper proposes a stochastic framework to evaluate the performance of public transit systems under short random service suspensions. We aim to derive closed-form formulations of the mean and variance of the queue length and waiting time. A bulk-service queue model is adopted to formulate the queuing behavior in the system. The random service suspension is modeled as a two-state (disruption and normal) Markov process. We prove that headway is distributed as the difference between two compound Poisson exponential random variables. The distribution is used to specify the mean and variance of queue length and waiting time at each station with analytical formulations. The closed-form stability condition of the system is also derived, implying that the system is more likely to be unstable with high incident rates and long incident duration. The proposed model is implemented on a bus network. Results show that higher incident rates and higher average incident duration will increase both the mean and variance of queue length and waiting time, which is consistent with the theoretical analysis. Crowded stations are more vulnerable to random service suspensions. The theoretical results are validated with a simulation model, showing consistency between the two outcomes.

preprint | 2022 | arXiv

Optimal Policy for Inventory Management with Periodic and Controlled Resets

Inventory management problems with periodic and controllable resets occur in the context of managing water storage in the developing world and retailing limited-time availability products. In this paper, we consider a set of sequential decision problems in which the decision-maker must not only balance holding and shortage costs but also discard all inventory before a fixed number of decision epochs, with the option for an early inventory reset. Finding optimal policies using dynamic programming for these problems is particularly challenging since the resulting value functions are non-convex. Moreover, this structure cannot be easily analyzed using existing extended definitions, such as $K$-convexity. Our key contribution is to present sufficient conditions that ensure the optimal policy has an easily interpretable structure that generalizes the well-known $(s, S)$ policy from the operations literature. Furthermore, we demonstrate that the optimal policy has a four-threshold structure under these rather mild conditions. We then conclude with computational experiments, thereby illustrating the policy structures that can be extracted in several inventory management scenarios.
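
For readers unfamiliar with the classical $(s, S)$ policy that the abstract above generalizes, a minimal sketch follows. It shows only the standard two-threshold rule (order up to S whenever inventory falls below s), not the four-threshold policy of the paper. The function names, the strict "below s" convention, and the backlogging assumption are illustrative choices.

```python
from typing import List, Sequence, Tuple


def order_quantity(inventory: float, s: float, S: float) -> float:
    """Classical (s, S) rule: if inventory is below s, order up to S; else order nothing."""
    if s > S:
        raise ValueError("reorder point s must not exceed order-up-to level S")
    return S - inventory if inventory < s else 0.0


def simulate(
    initial_inventory: float, demands: Sequence[float], s: float, S: float
) -> List[Tuple[float, float]]:
    """Apply the rule period by period.

    Assumes orders arrive immediately and unmet demand is backlogged
    (inventory may go negative). Returns (order, end-of-period inventory) pairs.
    """
    inventory = initial_inventory
    history = []
    for demand in demands:
        order = order_quantity(inventory, s, S)
        inventory = inventory + order - demand
        history.append((order, inventory))
    return history


# Hypothetical parameters and demand path, invented for illustration.
trace = simulate(initial_inventory=5.0, demands=[4.0, 3.0, 1.0], s=2.0, S=10.0)
```

In this made-up trace, an order is placed only in the second period, when the starting inventory of 1 is below the reorder point of 2.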

preprint | 2022 | arXiv

Quantum Computing Methods for Supply Chain Management

Quantum computing is expected to have transformative influences on many domains, but its practical deployments on industry problems are underexplored. We focus on applying quantum computing to operations management problems in industry, and in particular, supply chain management. Many problems in supply chain management involve large state and action spaces and pose computational challenges on classical computers. We develop a quantized policy iteration algorithm to solve an inventory control problem and demonstrate its effectiveness. We also discuss in depth the hardware requirements and potential challenges of implementing this quantum algorithm in the near term. Our simulations and experiments are powered by IBM Qiskit and the qBraid system.

preprint | 2020 | arXiv

3-D Dynamic UAV Base Station Location Problem

We address a dynamic covering location problem of an Unmanned Aerial Vehicle Base Station (UAV-BS), where the location sequence of a single UAV-BS in a wireless communication network is determined to satisfy data demand arising from ground users. This problem is especially relevant in the context of smart grid and disaster relief. The vertical movement ability of the UAV-BS and non-convex covering functions in wireless communication restrict utilizing classical planar covering location approaches. Therefore, we develop new formulations for this emerging problem over a finite time horizon to maximize the total coverage. In particular, we develop a mixed-integer non-linear programming formulation which is non-convex in nature, and propose a Lagrangean Decomposition Algorithm (LDA) to solve this formulation. Due to the high complexity of the problem, the LDA is still unable to find good local solutions to large-scale problems. Therefore, we develop a Continuum Approximation (CA) model and show that CA would be a promising approach in terms of both computational time and solution accuracy. Our numerical study also shows that the CA model can be a remedy to build efficient initial solutions for exact solution algorithms.

preprint | 2020 | arXiv

Urban Bike Lane Planning with Bike Trajectories: Models, Algorithms, and a Real-World Case Study

We study an urban bike lane planning problem based on fine-grained bike trajectory data, which is made available by smart city infrastructure such as bike-sharing systems. The key decision is where to build bike lanes in the existing road network. As bike-sharing systems become widespread in metropolitan areas around the world, bike lanes are being planned and constructed by many municipal governments to promote cycling and protect cyclists. Traditional bike lane planning approaches often rely on surveys and heuristics. We develop a general and novel optimization framework to guide the bike lane planning from bike trajectories. We formalize the bike lane planning problem in view of the cyclists' utility functions and derive an integer optimization model to maximize the utility. To capture cyclists' route choices, we develop a bilevel program based on the Multinomial Logit model. We derive structural properties of the base model and prove that the Lagrangian dual of the bike lane planning model is polynomial-time solvable. Furthermore, we reformulate the route choice based planning model as a mixed integer linear program using a linear approximation scheme. We develop tractable formulations and efficient algorithms to solve the large-scale optimization problem. Via a real-world case study with a city government, we demonstrate the efficiency of the proposed algorithms and quantify the trade-off between the coverage of bike trips and continuity of bike lanes. We show how the network topology evolves according to the utility functions and highlight the importance of understanding cyclists' route choices. The proposed framework drives the data-driven urban planning scheme in smart city operations management.
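
The abstract above builds its route-choice model on the Multinomial Logit (MNL) model. As a rough illustration of that building block only, the standard MNL choice probabilities for a set of candidate routes can be computed as below. The function name, the `scale` parameter, and the example utilities are hypothetical; the paper's bilevel program and utility specification are not reproduced here.

```python
import math
from typing import List, Sequence


def mnl_choice_probabilities(utilities: Sequence[float], scale: float = 1.0) -> List[float]:
    """Standard multinomial logit: P(i) = exp(scale * u_i) / sum_j exp(scale * u_j)."""
    if not utilities:
        raise ValueError("at least one alternative is required")
    scaled = [scale * u for u in utilities]
    # Subtract the maximum before exponentiating for numerical stability;
    # this leaves the resulting probabilities unchanged.
    largest = max(scaled)
    weights = [math.exp(v - largest) for v in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical utilities for three candidate routes, invented for illustration.
route_probabilities = mnl_choice_probabilities([1.2, 0.4, 0.4])
```

Routes with equal utility receive equal probability, and the route with the highest utility is the most likely choice.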