Source author record

Jinyu Zhang

Jinyu Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence, Information Retrieval, Machine Learning, Computer Vision, cond-mat.mes-hall, hep-ph, Robotics

Catalog footprint

What is connected

6 works

7 topics

4 close collaborators

Preprint, 2026, arXiv

Segmentation-Driven Monocular Shape from Polarization based on Physical Model

Monocular shape-from-polarization (SfP) leverages the intrinsic relationship between light polarization properties and surface geometry to recover surface normals from single-view polarized images, providing a compact and robust approach for three-dimensional (3D) reconstruction. Despite its potential, existing monocular SfP methods suffer from azimuth angle ambiguity, an inherent limitation of polarization analysis, that severely compromises reconstruction accuracy and stability. This paper introduces a novel segmentation-driven monocular SfP (SMSfP) framework that reformulates global shape recovery into a set of local reconstructions over adaptively segmented convex sub-regions. Specifically, a polarization-aided adaptive region growing (PARG) segmentation strategy is proposed to decompose the global convexity assumption into locally convex regions, effectively suppressing azimuth ambiguities and preserving surface continuity. Furthermore, a multi-scale fusion convexity prior (MFCP) constraint is developed to ensure local surface consistency and enhance the recovery of fine textural and structural details. Extensive experiments on both synthetic and real-world datasets validate the proposed approach, showing significant improvements in disambiguation accuracy and geometric fidelity compared with existing physics-based monocular SfP techniques.
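The azimuth ambiguity mentioned above can be illustrated with the standard four-angle polarization measurement. The following is a minimal sketch, not the SMSfP method: it assumes intensities captured behind linear polarizers at 0, 45, 90 and 135 degrees and a diffuse-reflection model, and the function names (`sample_intensities`, `polarization_cues`, `azimuth_candidates`) are hypothetical.

```python
import numpy as np

def sample_intensities(phi, i_mean=1.0, amplitude=0.4):
    """Synthetic intensities behind polarizers at 0/45/90/135 degrees.

    Uses the sinusoidal model I(theta) = i_mean + amplitude * cos(2 * (theta - phi)),
    where phi is the polarization angle (illustrative values, not from the paper).
    """
    thetas = np.array([0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4])
    return tuple(i_mean + amplitude * np.cos(2.0 * (thetas - phi)))

def polarization_cues(i0, i45, i90, i135):
    """Degree and angle of linear polarization from the linear Stokes parameters."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity
    s1 = i0 - i90
    s2 = i45 - i135
    dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-12)
    aolp = 0.5 * np.arctan2(s2, s1)      # only defined modulo pi
    return dolp, aolp

def azimuth_candidates(aolp):
    """Two surface-azimuth candidates, pi apart (diffuse-reflection assumption)."""
    phi = np.mod(aolp, np.pi)
    return phi, phi + np.pi

# The measurement cannot distinguish phi from phi + pi:
same = np.allclose(sample_intensities(0.3), sample_intensities(0.3 + np.pi))
```

Because the intensity depends on the angle only through `cos(2 * (theta - phi))`, both candidates explain the images equally well, which is why an extra prior such as local convexity is needed to pick one.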

Preprint, 2026, arXiv

Unified Embodied VLM Reasoning with Robotic Action via Autoregressive Discretized Pre-training

General-purpose robotic systems operating in open-world environments must achieve both broad generalization and high-precision action execution, a combination that remains challenging for existing Vision-Language-Action (VLA) models. While large Vision-Language Models (VLMs) improve semantic generalization, insufficient embodied reasoning leads to brittle behavior, and conversely, strong reasoning alone is inadequate without precise control. To provide a decoupled and quantitative assessment of this bottleneck, we introduce Embodied Reasoning Intelligence Quotient (ERIQ), a large-scale embodied reasoning benchmark in robotic manipulation, comprising 6K+ question-answer pairs across four reasoning dimensions. By decoupling reasoning from execution, ERIQ enables systematic evaluation and reveals a strong positive correlation between embodied reasoning capability and end-to-end VLA generalization. To bridge the gap from reasoning to precise execution, we propose FACT, a flow-matching-based action tokenizer that converts continuous control into discrete sequences while preserving high-fidelity trajectory reconstruction. The resulting GenieReasoner jointly optimizes reasoning and action in a unified space, outperforming both continuous-action and prior discrete-action baselines in real-world tasks. Together, ERIQ and FACT provide a principled framework for diagnosing and overcoming the reasoning-precision trade-off, advancing robust, general-purpose robotic manipulation. Project page: https://geniereasoner.github.io/GenieReasoner/

Preprint, 2022, arXiv

Reinforcement Learning-enhanced Shared-account Cross-domain Sequential Recommendation

Shared-account Cross-domain Sequential Recommendation (SCSR) is an emerging yet challenging task that simultaneously considers the shared-account and cross-domain characteristics of sequential recommendation. Existing works on SCSR are mainly based on Recurrent Neural Networks (RNNs) and Graph Neural Networks (GNNs), but they ignore the fact that although multiple users share a single account, it is mainly occupied by one user at a time. This observation motivates us to learn a more accurate user-specific account representation by attentively focusing on its recent behaviors. Furthermore, though existing works assign lower weights to irrelevant interactions, they may still dilute the domain information and impede the cross-domain recommendation. To address the above issues, we propose a reinforcement learning-based solution, namely RL-ISN, which consists of a basic cross-domain recommender and a reinforcement learning-based domain filter. Specifically, to model the account representation in the shared-account scenario, the basic recommender first clusters users' mixed behaviors as latent users, and then leverages an attention model over them to conduct user identification. To reduce the impact of irrelevant domain information, we formulate the domain filter as a hierarchical reinforcement learning task, where a high-level task decides whether to revise the whole transferred sequence, and if so, a low-level task determines whether to remove each interaction within it. To evaluate the performance of our solution, we conduct extensive experiments on two real-world datasets, and the experimental results demonstrate the superiority of our RL-ISN method compared with state-of-the-art recommendation methods.
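The two-level decision structure of the domain filter can be sketched as plain control flow. This is an illustration under stated assumptions, not the RL-ISN implementation: `revise_policy` and `remove_policy` are hypothetical callables standing in for the learned high-level and low-level policies, and no training is shown.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")

def hierarchical_filter(
    sequence: Sequence[T],
    revise_policy: Callable[[Sequence[T]], bool],
    remove_policy: Callable[[Sequence[T], int], bool],
) -> List[T]:
    """Filter a transferred interaction sequence in two stages.

    High level: decide once whether the sequence should be revised at all.
    Low level: if so, decide for each position whether to drop that interaction.
    Both policies are placeholders for learned models (assumption).
    """
    if not revise_policy(sequence):
        return list(sequence)  # keep the sequence untouched
    return [
        item
        for index, item in enumerate(sequence)
        if not remove_policy(sequence, index)
    ]

# Toy stand-in policies: revise long sequences, drop items tagged "noise".
toy_sequence = ["movie:1", "noise:7", "movie:2", "noise:9"]
filtered = hierarchical_filter(
    toy_sequence,
    revise_policy=lambda seq: len(seq) > 2,
    remove_policy=lambda seq, i: seq[i].startswith("noise"),
)
```

Separating the two decisions means the low-level policy only runs on sequences the high-level policy has flagged, which is the structure the abstract describes.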

Preprint, 2022, arXiv

Time Interval-enhanced Graph Neural Network for Shared-account Cross-domain Sequential Recommendation

The Shared-account Cross-domain Sequential Recommendation (SCSR) task aims to recommend the next item by leveraging the mixed user behaviors in multiple domains. It is gaining immense research attention as more and more users tend to sign up on different platforms and share accounts with others to access domain-specific services. Existing works on SCSR mainly rely on mining sequential patterns via Recurrent Neural Network (RNN)-based models, which suffer from the following limitations: 1) RNN-based methods overwhelmingly target discovering sequential dependencies in single-user behaviors. They are not expressive enough to capture the relationships among multiple entities in SCSR. 2) All existing methods bridge two domains via knowledge transfer in the latent space, and ignore the explicit cross-domain graph structure. 3) No existing studies consider the time interval information among items, which is essential in sequential recommendation for characterizing different items and learning discriminative representations for them. In this work, we propose a new graph-based solution, namely TiDA-GCN, to address the above challenges. Specifically, we first link users and items in each domain as a graph. Then, we devise a domain-aware graph convolution network to learn user-specific node representations. To fully account for users' domain-specific preferences on items, two effective attention mechanisms are further developed to selectively guide the message passing process. Moreover, to further enhance item- and account-level representation learning, we incorporate the time interval into the message passing, and design an account-aware self-attention module for learning items' interactive characteristics. Experiments demonstrate the superiority of our proposed method from various aspects.

Preprint, 2014, arXiv

Compact Model of Nanowire Tunneling FETs Including Phonon-Assisted Tunneling and Quantum Capacitance

A physics-based compact model for silicon gate-all-around (GAA) nanowire tunneling FETs (NW-tFETs) with good accuracy has been developed by considering Phonon-Assisted Tunneling (PAT) and the transition from the Quantum Capacitance Limit (QCL) to the Classical Limit (CL) during device-size scaling. PAT broadens a single electron-energy level into an energy band whose density-of-states (DOS) distribution has a Lorentzian shape. As a consequence, the tunneling probability at the edge of the tunneling window no longer changes abruptly from zero to a finite value. By adjusting the parameters in the Lorentzian function, an accurate fit to the measured transfer characteristics in the subthreshold region is made possible. Besides, with an analytical formula to calculate the channel potential, the model naturally covers the transition from the QCL to the CL regime when the device size is scaled. Furthermore, an on-voltage is defined to facilitate the modeling and fitting processes. Comparisons with the experimental data demonstrate the model accuracy across all device operation regions, and the flexibility in model parameter extraction is also shown.
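The effect of Lorentzian broadening at the edge of the tunneling window can be illustrated numerically. The sketch below is not the compact model itself: it only compares a sharp energy level with a Lorentzian-broadened one, and the half-width `gamma`, the window bounds and the energy units (eV) are illustrative assumptions.

```python
import math

def lorentzian_dos(e, e0, gamma):
    """Normalized Lorentzian centered at e0 with half-width at half-maximum gamma."""
    return (gamma / math.pi) / ((e - e0) ** 2 + gamma ** 2)

def weight_in_window(e0, gamma, e_low, e_high):
    """Fraction of the broadened level lying inside [e_low, e_high].

    Closed-form integral of lorentzian_dos over the window.
    """
    upper = math.atan((e_high - e0) / gamma)
    lower = math.atan((e_low - e0) / gamma)
    return (upper - lower) / math.pi

def sharp_weight(e0, e_low, e_high):
    """Unbroadened level: entirely inside the window or entirely outside."""
    return 1.0 if e_low <= e0 <= e_high else 0.0

# Sweep the level across the lower window edge (hypothetical values, in eV).
E_LOW, E_HIGH, GAMMA = 0.0, 1.0, 0.01
sweep = [(-0.10 + 0.05 * k) for k in range(5)]  # -0.10 ... 0.10
broadened = [weight_in_window(e0, GAMMA, E_LOW, E_HIGH) for e0 in sweep]
sharp = [sharp_weight(e0, E_LOW, E_HIGH) for e0 in sweep]
```

In this toy setting the sharp level jumps from 0 to 1 at the window edge, while the broadened level rises smoothly and is already non-zero just below the edge, which mirrors the gradual turn-on described in the abstract.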

Preprint, 2002, arXiv

The study of a flavor-changing neutral toppion production process $e^+e^-\to t\bar{c}\Pi_t^0$

We have studied a flavor-changing toppion production process $e^{+}e^{-}\to t\bar{c}\Pi^{0}_{t}$ in the topcolor-assisted technicolor (TC2) model. The studies show that, at the high centre-of-mass energy of the TESLA collider, the production cross section of $e^{+}e^{-}\to t\bar{c}\Pi^{0}_{t}$ is of order 0.1 fb in most parameter regions of the TC2 model, and a few tens of toppion events can be produced each year. The resonance effect can enhance the cross section to a few fb when the toppion mass is small. With a clean background, the toppion events can possibly be detected at the TESLA collider. On the other hand, we find that there exists a narrow peak near $m_t-m_c$ in the toppion-charm invariant mass distribution which could be clearly detected. Therefore, the toppion production process $e^{+}e^{-}\to t\bar{c}\Pi^{0}_{t}$ provides a unique chance to detect toppion events and test the TC2 model.
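The link between a cross section of order 0.1 fb and "a few tens" of events per year follows from the usual relation N = σ × L. The sketch below is a back-of-the-envelope check, not a calculation from the paper: the yearly integrated luminosity of 300 fb⁻¹ and the detection efficiency are assumed values chosen for illustration.

```python
def expected_events(cross_section_fb, integrated_luminosity_fb_inv, efficiency=1.0):
    """Expected event count N = sigma * L * efficiency.

    cross_section_fb: production cross section in femtobarns.
    integrated_luminosity_fb_inv: integrated luminosity in inverse femtobarns.
    efficiency: detection efficiency between 0 and 1 (assumed, default 1).
    """
    return cross_section_fb * integrated_luminosity_fb_inv * efficiency

# Assumed yearly integrated luminosity (hypothetical, not taken from the abstract).
ASSUMED_LUMINOSITY_FB_INV = 300.0

typical = expected_events(0.1, ASSUMED_LUMINOSITY_FB_INV)   # about 30 events per year
resonant = expected_events(3.0, ASSUMED_LUMINOSITY_FB_INV)  # "a few fb" case, about 900
```

Under this assumed luminosity, a 0.1 fb cross section gives roughly 30 events per year, consistent with the "few tens" quoted above.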