Researcher profile

Yao Cheng

Yao Cheng contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
12 works
0 followers
9 topics
4close collaborators

Published work

12 published items

preprint, 2026, arXiv

FaceRefiner: High-Fidelity Facial Texture Refinement with Differentiable Rendering-based Style Transfer

Recent facial texture generation methods prefer to use deep networks to synthesize image content and then fill in the UV map, thus generating a compelling full texture from a single image. Nevertheless, the synthesized texture UV map usually comes from a space constructed by the training data or the 2D face generator, which limits the methods' generalization ability for in-the-wild input images. Consequently, their facial details, structures and identity may not be consistent with the input. In this paper, we address this issue by proposing a style transfer-based facial texture refinement method named FaceRefiner. FaceRefiner treats the 3D sampled texture as style and the output of a texture generation method as content. The photo-realistic style is then expected to be transferred from the style image to the content image. Unlike current style transfer methods, which transfer only high- and middle-level information to the result, our style transfer method integrates differentiable rendering to also transfer low-level (or pixel-level) information in the visible face regions. The main benefit of such multi-level information transfer is that the details, structures and semantics in the input can be well preserved. Extensive experiments on the Multi-PIE, CelebA and FFHQ datasets demonstrate that our refinement method improves texture quality and face identity preservation compared with state-of-the-art methods.

preprint, 2026, arXiv

TalkingEyes: Pluralistic Speech-Driven 3D Eye Gaze Animation

Although significant progress has been made in the field of speech-driven 3D facial animation recently, the speech-driven animation of an indispensable facial component, eye gaze, has been overlooked by recent research. This is primarily due to the weak correlation between speech and eye gaze, as well as the scarcity of audio-gaze data, making it very challenging to generate 3D eye gaze motion from speech alone. In this paper, we propose a novel data-driven method which can generate diverse 3D eye gaze motions in harmony with the speech. To achieve this, we first construct an audio-gaze dataset that contains about 14 hours of audio-mesh sequences featuring high-quality eye gaze motion, head motion and facial motion simultaneously. The motion data is acquired by performing lightweight eye gaze fitting and face reconstruction on videos from existing audio-visual datasets. We then tailor a novel speech-to-motion translation framework in which the head motions and eye gaze motions are jointly generated from speech but are modeled in two separate latent spaces. This design stems from the physiological knowledge that the rotation range of the eyeballs is smaller than that of the head. By mapping the speech embedding into the two latent spaces, the difficulty of modeling the weak correlation between speech and non-verbal motion is attenuated. Finally, our TalkingEyes, integrated with a speech-driven 3D facial motion generator, can synthesize eye gaze motion, eye blinks, head motion and facial motion collectively from speech. Extensive quantitative and qualitative evaluations demonstrate the superiority of the proposed method in generating diverse and natural 3D eye gaze motions from speech. The project page of this paper is: https://lkjkjoiuiu.github.io/TalkingEyes_Home/

preprint, 2026, arXiv

UniISP: A Unified ISP Framework for Both Human and Machine Vision

Compared to RGB images, raw sensor data provides a richer representation of information, which is crucial for accurate recognition, particularly under challenging conditions such as low-light environments. The traditional Image Signal Processing (ISP) pipeline generates visually pleasing RGB images for human perception through a series of steps, but some of these operations may adversely impact the information integrity by introducing compression and loss. Furthermore, in computer vision tasks that directly utilize raw camera data, most existing methods integrate minimal ISP processing with downstream networks, yet the resulting images are often difficult to visualize or do not align with human aesthetic preferences. This paper proposes UniISP, a novel ISP framework designed to simultaneously meet the requirements of both human visual perception and computer vision applications. By incorporating a carefully designed Hybrid Attention Module (HAM) and employing supervised learning, the proposed method ensures that the generated images are visually appealing. Additionally, a Feature Adapter module is introduced to effectively propagate informative features from the ISP stage to subsequent downstream networks. Extensive experiments demonstrate that our approach achieves state-of-the-art performance across various scenarios and multiple datasets, proving its generalizability and effectiveness.

preprint, 2022, arXiv

Finding Global Homophily in Graph Neural Networks When Meeting Heterophily

We investigate graph neural networks on graphs with heterophily. Some existing methods amplify a node's neighborhood with multi-hop neighbors to include more nodes with homophily. However, it is a significant challenge to set personalized neighborhood sizes for different nodes. Further, homophilous nodes that fall outside the neighborhood are ignored during information aggregation. To address these problems, we propose two models, GloGNN and GloGNN++, which generate a node's embedding by aggregating information from global nodes in the graph. In each layer, both models learn a coefficient matrix to capture the correlations between nodes, based on which neighborhood aggregation is performed. The coefficient matrix allows signed values and is derived from an optimization problem that has a closed-form solution. We further accelerate neighborhood aggregation and derive a linear time complexity. We theoretically explain the models' effectiveness by proving that both the coefficient matrix and the generated node embedding matrix have the desired grouping effect. We conduct extensive experiments to compare our models against 11 other competitors on 15 benchmark datasets in a wide range of domains, scales and graph heterophilies. Experimental results show that our methods achieve superior performance and are also very efficient.

preprint, 2022, arXiv

On the Rankin-Selberg $L$-factors for ${\rm SO}_{5}\times{\rm GL}_2$

Let $π$ and $τ$ be smooth generic representations of ${\rm SO}_5$ and ${\rm GL}_2$ respectively over a non-archimedean local field. Assume that $π$ is irreducible and $τ$ is irreducible or induced of Langlands' type. We show that the $L$- and $ε$-factors attached to $π\timesτ$ defined by the Rankin-Selberg integrals and the associated Weil-Deligne representation coincide. Similar compatibility results are also obtained for the local factors defined by Novodvorsky's local zeta integrals attached to generic representations of ${\rm GSp}_4\times{\rm GL}_2$.

preprint, 2022, arXiv

Rankin-Selberg integrals for ${\rm SO}_{2n+1}\times{\rm GL}_r$ attached to Newforms and Oldforms

The conjectural newform theory for generic representations of $p$-adic ${\rm SO}_{2n+1}$ was formulated by P.-Y. Tsai in her thesis in which Tsai also verified the conjecture when the representations are supercuspidal. The main purpose of this work is to compute the Rankin-Selberg integrals for ${\rm SO}_{2n+1}\times{\rm GL}_r$ with $1\le r\le n$ attached to newforms and also oldforms under the validity of the conjecture.

preprint, 2022, arXiv

Supercloseness of the local discontinuous Galerkin method for a singularly perturbed convection-diffusion problem

A singularly perturbed convection-diffusion problem posed on the unit square in $\mathbb{R}^2$, whose solution has exponential boundary layers, is solved numerically using the local discontinuous Galerkin (LDG) method with piecewise polynomials of degree at most $k>0$ on three families of layer-adapted meshes: Shishkin-type, Bakhvalov-Shishkin-type and Bakhvalov-type. On Shishkin-type meshes this method is known to be no greater than $O(N^{-(k+1/2)})$ accurate in the energy norm induced by the bilinear form of the weak formulation, where $N$ mesh intervals are used in each coordinate direction. (Note: all bounds in this abstract are uniform in the singular perturbation parameter and neglect logarithmic factors that will appear in our detailed analysis.) A delicate argument is used in this paper to establish $O(N^{-(k+1)})$ energy-norm superconvergence on all three types of mesh for the difference between the LDG solution and a local Gauss-Radau projection of the exact solution into the finite element space. This supercloseness property implies a new $N^{-(k+1)}$ bound for the $L^2$ error between the LDG solution on each type of mesh and the exact solution of the problem; this bound is optimal (up to logarithmic factors). Numerical experiments confirm our theoretical results.

preprint, 2021, arXiv

Local discontinuous Galerkin method on layer-adapted meshes for singularly perturbed reaction-diffusion problems in two dimensions

We analyse the local discontinuous Galerkin (LDG) method for two-dimensional singularly perturbed reaction-diffusion problems. A class of layer-adapted meshes, including Shishkin- and Bakhvalov-type meshes, is discussed within a general framework. Local projections and their approximation properties on anisotropic meshes are used to derive error estimates for energy and "balanced" norms. Here, the energy norm is naturally derived from the bilinear form of the LDG formulation and the "balanced" norm is artificially introduced to capture the boundary layer contribution. We establish a uniform convergence of order $k$ for the LDG method using the balanced norm with the local weighted $L^2$ projection as well as an optimal convergence of order $k+1$ for the energy norm using the local Gauss-Radau projections. Numerical experiments are presented.

preprint, 2020, arXiv

DeepMnemonic: Password Mnemonic Generation via Deep Attentive Encoder-Decoder Model

Strong passwords are fundamental to the security of password-based user authentication systems. In recent years, much effort has been made to evaluate password strength or to generate strong passwords. Unfortunately, the usability or memorability of the strong passwords has been largely neglected. In this paper, we aim to bridge the gap between strong password generation and the usability of strong passwords. We propose to automatically generate textual password mnemonics, i.e., natural language sentences, which are intended to help users better memorize passwords. We introduce DeepMnemonic, a deep attentive encoder-decoder framework which takes a password as input and then automatically generates a mnemonic sentence for the password. We conduct extensive experiments to evaluate DeepMnemonic on real-world data sets. The experimental results demonstrate that DeepMnemonic outperforms a well-known baseline for generating semantically meaningful mnemonic sentences. Moreover, the user study further validates that the mnemonic sentences generated by DeepMnemonic are useful in helping users memorize strong passwords.

preprint, 2020, arXiv

Keyed Non-Parametric Hypothesis Tests

The recent popularity of machine learning calls for a deeper understanding of AI security. Amongst the numerous AI threats published so far, poisoning attacks currently attract considerable attention. In a poisoning attack the opponent partially tampers with the dataset used for learning to mislead the classifier during the testing phase. This paper proposes a new protection strategy against poisoning attacks. The technique relies on a new primitive called keyed non-parametric hypothesis tests, which make it possible to evaluate, under adversarial conditions, the training input's conformance with a previously learned distribution $\mathfrak{D}$. To do so we use a secret key $κ$ unknown to the opponent. Keyed non-parametric hypothesis tests differ from classical tests in that the secrecy of $κ$ prevents the opponent from misleading the keyed test into concluding that a (significantly) tampered dataset belongs to $\mathfrak{D}$.

preprint, 2020, arXiv

Load-velocity-temperature relationship in frictional response of microscopic contacts

Frictional properties of interfaces with dynamic chemical bonds have been the subject of intensive experimental investigation and modeling, as they provide important insights into the molecular origin of the empirical rate and state laws, which have been highly successful in describing friction from nano to geophysical scales. Using previously developed theoretical approaches requires time-consuming simulations that are impractical for many realistic tribological systems. To solve this problem and set a framework for understanding microscopic mechanisms of friction at interfaces including multiple microscopic contacts, we developed an analytical approach for describing friction mediated by dynamical formation and rupture of microscopic interfacial contacts, which allows frictional properties to be calculated on the time and length scales that are relevant to tribological experimental conditions. The model accounts for the presence of various types of contacts at the frictional interface and predicts novel dependencies of friction on sliding velocity, temperature, and normal load, which are amenable to experimental observations. Our model predicts the velocity-temperature scaling, which relies on the interplay between the effects of shear and temperature on the rupture of interfacial contacts. The proposed scaling can be used to extrapolate the simulation results to a range of very low sliding velocities used in nanoscale friction experiments, which is still unreachable by simulations. For interfaces including two types of interfacial contacts with distinct properties, our model predicts novel double-peaked dependencies of friction on temperature and velocity. Our work provides a promising avenue for the interpretation of the experimental data on friction at interfaces including microscopic contacts and opens new pathways for the rational control of the frictional response.

preprint, 2020, arXiv

Special value formula for the twisted triple product $L$-function and an application to the restricted $L^2$-norm problem

We establish explicit Ichino's formulae for the central values of the triple product $L$-functions with emphasis on the calculations for the real place. The key ingredient for our computations is Proposition 6.8 which generalizes a result of Michel-Venkatesh. As an application we prove the optimal upper bound of a sum of restricted $L^2$-norms of the $L^2$-normalized newforms on certain quadratic extensions with prime level and bounded spectral parameter following the methods of Blomer.