Researcher profile

Hong Zhu

Hong Zhu contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, unclaimed author
10 works
0 followers
10 topics
4 close collaborators

Published work

10 published items

preprint, 2026, arXiv

On the complexity of proximal gradient and proximal gradient-Newton-CG methods for \(\ell_1\)-regularized Optimization

In this paper, we propose two second-order methods for solving the \(\ell_1\)-regularized composite optimization problem, which are developed based on two distinct definitions of approximate second-order stationary points. We introduce a hybrid proximal gradient and negative curvature method, as well as an adaptive hybrid proximal gradient-Newton-CG method with negative curvature directions, to find a strong* approximate second-order stationary point and a weak approximate second-order stationary point for \(\ell_1\)-regularized optimization problems, respectively. Comprehensive analyses are provided regarding the iteration complexity, computational complexity, and the local superlinear convergence rates of the first phases of these two methods under specific error bound conditions. We demonstrate that the proximal gradient-Newton-CG method achieves the best-known iteration complexity for attaining the proposed weak approximate second-order stationary point, which is consistent with the results for finding an approximate second-order stationary point in unconstrained optimization. Through a toy example, we show that our proposed methods can effectively escape the first-order approximate solution. Numerical experiments implemented on the \(\ell_1\)-regularized Student's t-regression problem validate the effectiveness of both methods.
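As background for the abstract above, the following is a minimal sketch of the plain proximal gradient step for an \(\ell_1\)-regularized problem \(\min_x f(x) + \lambda \|x\|_1\). It is not the paper's hybrid negative-curvature or Newton-CG method. The function names, the least-squares choice of \(f\), and the step size are assumptions made only for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_gradient(grad_f, x0, lam, step, iters=500):
    """Iterate x <- prox_{step*lam*||.||_1}(x - step * grad_f(x))."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = soft_threshold(x - step * grad_f(x), step * lam)
    return x

# Hypothetical smooth term (an assumption, not the paper's Student's t loss):
# f(x) = 0.5 * ||A x - b||^2 with A = I, so the minimizer is soft_threshold(b, lam).
A = np.eye(2)
b = np.array([3.0, 0.1])
grad_f = lambda x: A.T @ (A @ x - b)

x_hat = proximal_gradient(grad_f, np.zeros(2), lam=0.5, step=1.0)
print(x_hat)
```

The small second coordinate of `b` is shrunk exactly to zero, which is the sparsity effect the \(\ell_1\) term is meant to produce. A first-order loop like this can stall at points that are only first-order stationary on nonconvex \(f\), which is the situation the abstract's second-order methods target.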

preprint, 2026, arXiv

RelayGR: Scaling Long-Sequence Generative Recommendation via Cross-Stage Relay-Race Inference

Real-time recommender systems execute multi-stage cascades (retrieval, pre-processing, fine-grained ranking) under strict tail-latency SLOs, leaving only tens of milliseconds for ranking. Generative recommendation (GR) models can improve quality by consuming long user-behavior sequences, but in production their online sequence length is tightly capped by the ranking-stage P99 budget. We observe that the majority of GR tokens encode user behaviors that are independent of the item candidates, suggesting an opportunity to pre-infer a user-behavior prefix once and reuse it during ranking rather than recomputing it on the critical path. Realizing this idea at industrial scale is non-trivial: the prefix cache must survive across multiple pipeline stages before the final ranking instance is determined, the user population implies cache footprints far beyond a single device, and indiscriminate pre-inference would overload shared resources under high QPS. We present RelayGR, a production system that enables in-HBM relay-race inference for GR. RelayGR selectively pre-infers long-term user prefixes, keeps their KV caches resident in HBM over the request lifecycle, and ensures the subsequent ranking can consume them without remote fetches. RelayGR combines three techniques: 1) a sequence-aware trigger that admits only at-risk requests under a bounded cache footprint and pre-inference load, 2) an affinity-aware router that co-locates cache production and consumption by routing both the auxiliary pre-infer signal and the ranking request to the same instance, and 3) a memory-aware expander that uses server-local DRAM to capture short-term cross-request reuse while avoiding redundant reloads. We implement RelayGR on Huawei Ascend NPUs and evaluate it with real queries. Under a fixed P99 SLO, RelayGR supports up to 1.5\(\times\) longer sequences and improves SLO-compliant throughput by up to 3.6\(\times\).

preprint, 2022, arXiv

CoDo: Contrastive Learning with Downstream Background Invariance for Detection

Prior self-supervised learning research mainly selects image-level instance discrimination as the pretext task. It achieves classification performance comparable to supervised learning methods, but with degraded transfer performance on downstream tasks such as object detection. To bridge the performance gap, we propose a novel object-level self-supervised learning method, called Contrastive learning with Downstream background invariance (CoDo). The pretext task is converted to focus on instance location modeling for various backgrounds, especially for downstream datasets. Background invariance is considered vital for object detection. First, a data augmentation strategy is proposed that pastes instances onto background images and then jitters the bounding box to include background information. Second, we implement architecture alignment between our pretraining network and mainstream detection pipelines. Third, hierarchical and multi-view contrastive learning is designed to improve visual representation learning. Experiments on MSCOCO demonstrate that the proposed CoDo with a common backbone, ResNet50-FPN, yields strong transfer learning results for object detection.

preprint, 2022, arXiv

Discovering Boundary Values of Feature-based Machine Learning Classifiers through Exploratory Datamorphic Testing

Testing has been widely recognised as difficult for AI applications. This paper proposes a set of strategies for testing machine learning applications in the framework of the datamorphism testing methodology. In these strategies, testing aims at exploring the data space of a classification or clustering application to discover the boundaries between classes that the machine learning application defines. This enables the tester to understand precisely the behaviour and function of the software under test. In the paper, three variants of exploratory strategies are presented with the algorithms implemented in the automated datamorphic testing tool Morphy. The correctness of these algorithms is formally proved. Their capability and cost of discovering borders between classes are evaluated via a set of controlled experiments with manually designed subjects and a set of case studies with real machine learning models.
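To make the idea of discovering a class boundary concrete, here is a generic bisection sketch: given two inputs that a classifier labels differently, it narrows the segment between them until the two endpoints are within a tolerance. This is not one of the three strategies implemented in Morphy, whose details the abstract does not give. The `find_boundary_pair` helper and the toy classifier are hypothetical.

```python
import numpy as np

def find_boundary_pair(classify, a, b, tol=1e-6):
    """Bisect the segment [a, b] until its endpoints straddle a boundary within tol.

    Invariant: classify(a) keeps its initial label, classify(b) differs from it.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    label_a = classify(a)
    if label_a == classify(b):
        raise ValueError("endpoints must be classified differently")
    while np.linalg.norm(b - a) > tol:
        mid = (a + b) / 2.0
        if classify(mid) == label_a:
            a = mid  # boundary lies between mid and b
        else:
            b = mid  # boundary lies between a and mid
    return a, b

# Hypothetical subject under test: a linear rule with boundary x + y = 1.
toy_classifier = lambda p: int(p[0] + p[1] > 1.0)

lo, hi = find_boundary_pair(toy_classifier, [0.0, 0.0], [2.0, 2.0])
print(lo, hi)
```

The returned pair consists of two nearly identical inputs that receive different labels, which is the kind of evidence a tester can inspect to understand where the model draws its border.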

preprint, 2022, arXiv

High-throughput calculations combining machine learning to investigate the corrosion properties of binary Mg alloys

Magnesium (Mg) alloys have shown great prospects as both structural and biomedical materials, but poor corrosion resistance limits their further application. In this work, to avoid time-consuming and laborious experimental trials, a high-throughput computational strategy based on first-principles calculations is designed for screening corrosion-resistant binary Mg alloys with intermetallics, from both the thermodynamic and kinetic perspectives. The stable binary Mg intermetallics with a low equilibrium potential difference with respect to the Mg matrix are first identified. Then, the hydrogen adsorption energies on the surfaces of these Mg intermetallics are calculated, and the corrosion exchange current density is further calculated by a hydrogen evolution reaction (HER) kinetic model. Several intermetallics, e.g. Y3Mg, Y2Mg and La5Mg, are identified as promising candidates that might effectively hinder the cathodic HER. Furthermore, machine learning (ML) models are developed to predict Mg intermetallics with proper hydrogen adsorption energy using the work function (W_f) and the weighted first ionization energy (WFIE). The generalization of the ML models is tested on five new binary Mg intermetallics, with an average root mean square error (RMSE) of 0.11 eV. This study not only predicts some promising binary Mg intermetallics which may suppress galvanic corrosion, but also provides a high-throughput screening strategy and ML models for the design of corrosion-resistant alloys, which can be extended to ternary Mg alloys or other alloy systems.

preprint, 2021, arXiv

Non-invasive Self-attention for Side Information Fusion in Sequential Recommendation

Sequential recommender systems aim to model users' evolving interests from their historical behaviors, and hence make customized time-relevant recommendations. Compared with traditional models, deep learning approaches such as CNN and RNN have achieved remarkable advancements in recommendation tasks. Recently, the BERT framework has also emerged as a promising method, benefiting from its self-attention mechanism in processing sequential data. However, one limitation of the original BERT framework is that it only considers one input source, the natural language tokens. How to leverage various types of information under the BERT framework is still an open question. Nonetheless, it is intuitively appealing to utilize other side information, such as item category or tag, for more comprehensive depictions and better recommendations. In our pilot experiments, we found that naive approaches, which directly fuse types of side information into the item embeddings, usually bring very little or even negative effects. Therefore, in this paper, we propose the NOninVasive self-attention mechanism (NOVA) to leverage side information effectively under the BERT framework. NOVA makes use of side information to generate a better attention distribution, rather than directly altering the item embedding, which may cause information overwhelming. We validate the NOVA-BERT model on both public and commercial datasets, and our method stably outperforms the state-of-the-art models with negligible computational overhead.
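One way to read the non-invasive idea is sketched below: side information enters only the queries and keys, so it shapes the attention distribution, while the values are computed from the unmodified item embeddings. This is a single-head NumPy illustration under stated assumptions, not the NOVA-BERT implementation. Additive fusion, the weight names, and the random data are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def noninvasive_attention(items, side, Wq, Wk, Wv):
    """Single-head attention where side information affects only Q and K."""
    fused = items + side              # assumption: additive fusion of side info
    q, k = fused @ Wq, fused @ Wk     # attention distribution sees side info
    v = items @ Wv                    # values use item embeddings only
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
n, d = 4, 8                           # sequence length and embedding size
items = rng.normal(size=(n, d))       # hypothetical item embeddings
side = rng.normal(size=(n, d))        # hypothetical side-information embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

out, weights = noninvasive_attention(items, side, Wq, Wk, Wv)
print(out.shape, weights.sum(axis=1))
```

Because each output row is a weighted average of item-only value vectors, the representation stays in the item embedding space, in contrast to the naive approach of fusing side information directly into the item embeddings.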

preprint, 2020, arXiv

Adversarial Model for Rotated Indoor Scenes Planning

In this paper, we propose an adversarial model for producing furniture layouts for interior scene synthesis when the interior room is rotated. The proposed model combines a conditional adversarial network, a rotation module, a mode module, and a rotation discriminator module. Compared with prior work on scene synthesis, our three proposed modules enhance the ability of auto-layout generation and reduce mode collapse during the rotation of the interior room. We conduct our experiments on a proposed real-world interior layout dataset that contains 14400 designs from professional designers. Our numerical results demonstrate that the proposed model yields higher-quality layouts for four types of rooms: the bedroom, the bathroom, the study room, and the tatami room.

preprint, 2020, arXiv

Hyper-Parameter Optimization: A Review of Algorithms and Applications

Since deep neural networks were developed, they have made huge contributions to everyday lives. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks are still challenging and unpredictable procedures. To lower the technical thresholds for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academic and industrial areas. This paper provides a review of the most essential topics on HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods to define the value range. Then, the research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy especially for deep learning networks. This study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art searching algorithms, feasibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.

preprint, 2020, arXiv

Structural Plan of Indoor Scenes with Personalized Preferences

In this paper, we propose an assistive model that supports professional interior designers in producing industrial interior decoration solutions and meeting the personalized preferences of property owners. The proposed model is able to automatically produce the layout of objects of a particular indoor scene according to property owners' preferences. In particular, the model consists of abstract graph extraction, conditional graph generation, and conditional scene instantiation. We provide an interior layout dataset that contains 11000 real-world designs from professional designers. Our numerical results on the dataset demonstrate the effectiveness of the proposed model compared with state-of-the-art methods.

preprint, 2019, arXiv

Anion charge-lattice volume dependent Li ion migration in compounds with the face-centered cubic anion frameworks

Proper design principles are essential for the efficient development of superionic conductors. However, the existing design principles are mainly proposed from the perspective of crystal structures. In this work, face-centered cubic (fcc) anion frameworks were constructed to study the effects of anion charge and lattice volume on the stability of lithium ion occupation and lithium ion migration. Both large negative anion charges and large lattice volumes increase the relative stability of the lithium-anion tetrahedron and make Li ions prefer to occupy the tetrahedral sites. For a tetrahedral Li ion migrating to its adjacent tetrahedral site through an octahedral transition state, the smaller the negative anion charge is, the lower the lithium ion migration barrier will be. For an octahedral Li ion migrating to its adjacent octahedral site through a tetrahedral transition state, by contrast, the larger the negative anion charge is, the lower the lithium ion migration barrier will be. New design principles for developing superionic conductors with the fcc anion framework were proposed. Low Li ion migration barriers would be achieved by adjusting the non-lithium elements within the same crystal structure framework to obtain the desired electronegativity difference between the anion element and the non-lithium cation element.