Researcher profile

Min Zhou

Min Zhou contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
15works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

15 published item(s)

preprint2026arXiv

CktGen: Automated Analog Circuit Design with Generative Artificial Intelligence

The automatic synthesis of analog circuits presents significant challenges. Most existing approaches formulate the problem as a single-objective optimization task, overlooking that design specifications for a given circuit type vary widely across applications. To address this, we introduce specification-conditioned analog circuit generation, a task that directly generates analog circuits based on target specifications. The motivation is to leverage existing well-designed circuits to improve automation in analog circuit design. Specifically, we propose CktGen, a simple yet effective variational autoencoder that maps discretized specifications and circuits into a joint latent space and reconstructs the circuit from that latent vector. Notably, as a single specification may correspond to multiple valid circuits, naively fusing specification information into the generative model does not capture these one-to-many relationships. To address this, we decouple the encoding of circuits and specifications and align their mapped latent space. Then, we employ contrastive training with a filter mask to maximize differences between encoded circuits and specifications. Furthermore, classifier guidance along with latent feature alignment promotes the clustering of circuits sharing the same specification, avoiding model collapse into trivial one-to-one mappings. By canonicalizing the latent space with respect to specifications, we can search for an optimal circuit that meets valid target specifications. We conduct comprehensive experiments on the open circuit benchmark and introduce metrics to evaluate cross-model consistency. Experimental results demonstrate that CktGen achieves substantial improvements over state-of-the-art methods.

preprint2025arXiv

ALF: Advertiser Large Foundation Model for Multi-Modal Advertiser Understanding

We present ALF (Advertiser Large Foundation model), a multi-modal transformer architecture for understanding advertiser behavior and intent across text, image, video, and structured data modalities. Through contrastive learning and multi-task optimization, ALF creates unified advertiser representations that capture both content and behavioral patterns. Our model achieves state-of-the-art performance on critical tasks including fraud detection, policy violation identification, and advertiser similarity matching. In production deployment, ALF demonstrates significant real-world impact by delivering simultaneous gains in both precision and recall, for instance boosting recall by over 40 percentage points on one critical policy and increasing precision to 99.8% on another. The architecture's effectiveness stems from its novel combination of multi-modal transformations, inter-sample attention mechanism, spectrally normalized projections, and calibrated probabilistic outputs.

preprint2022arXiv

Boosting Factorization Machines via Saliency-Guided Mixup

Factorization machines (FMs) are widely used in recommender systems due to their adaptability and ability to learn from sparse data. However, for the ubiquitous non-interactive features in sparse data, existing FMs can only estimate the parameters corresponding to these features via the inner product of their embeddings. Undeniably, they cannot learn the direct interactions of these features, which limits the model's expressive power. To this end, we first present MixFM, inspired by Mixup, to generate auxiliary training data to boost FMs. Unlike existing augmentation strategies that require labor costs and expertise to collect additional information such as position and fields, these extra data generated by MixFM only by the convex combination of the raw ones without any professional knowledge support. More importantly, if the parent samples to be mixed have non-interactive features, MixFM will establish their direct interactions. Second, considering that MixFM may generate redundant or even detrimental instances, we further put forward a novel Factorization Machine powered by Saliency-guided Mixup (denoted as SMFM). Guided by the customized saliency, SMFM can generate more informative neighbor data. Through theoretical analysis, we prove that the proposed methods minimize the upper bound of the generalization error, which hold a beneficial effect on enhancing FMs. Significantly, we give the first generalization bound of FM, implying the generalization requires more data and a smaller embedding size under the sufficient representation capability. Finally, extensive experiments on five datasets confirm that our approaches are superior to baselines. Besides, the results show that "poisoning" mixed data is likewise beneficial to the FM variants.

preprint2022arXiv

BSAL: A Framework of Bi-component Structure and Attribute Learning for Link Prediction

Given the ubiquitous existence of graph-structured data, learning the representations of nodes for the downstream tasks ranging from node classification, link prediction to graph classification is of crucial importance. Regarding missing link inference of diverse networks, we revisit the link prediction techniques and identify the importance of both the structural and attribute information. However, the available techniques either heavily count on the network topology which is spurious in practice or cannot integrate graph topology and features properly. To bridge the gap, we propose a bicomponent structural and attribute learning framework (BSAL) that is designed to adaptively leverage information from topology and feature spaces. Specifically, BSAL constructs a semantic topology via the node attributes and then gets the embeddings regarding the semantic view, which provides a flexible and easy-to-implement solution to adaptively incorporate the information carried by the node attributes. Then the semantic embedding together with topology embedding is fused together using an attention mechanism for the final prediction. Extensive experiments show the superior performance of our proposal and it significantly outperforms baselines on diverse research benchmarks.

preprint2022arXiv

Composition-aware Graphic Layout GAN for Visual-textual Presentation Designs

In this paper, we study the graphic layout generation problem of producing high-quality visual-textual presentation designs for given images. We note that image compositions, which contain not only global semantics but also spatial information, would largely affect layout results. Hence, we propose a deep generative model, dubbed as composition-aware graphic layout GAN (CGL-GAN), to synthesize layouts based on the global and spatial visual contents of input images. To obtain training images from images that already contain manually designed graphic layout data, previous work suggests masking design elements (e.g., texts and embellishments) as model inputs, which inevitably leaves hint of the ground truth. We study the misalignment between the training inputs (with hint masks) and test inputs (without masks), and design a novel domain alignment module (DAM) to narrow this gap. For training, we built a large-scale layout dataset which consists of 60,548 advertising posters with annotated layout information. To evaluate the generated layouts, we propose three novel metrics according to aesthetic intuitions. Through both quantitative and qualitative evaluations, we demonstrate that the proposed model can synthesize high-quality graphic layouts according to image compositions.

preprint2022arXiv

Discovering Representative Attribute-stars via Minimum Description Length

Graphs are a popular data type found in many domains. Numerous techniques have been proposed to find interesting patterns in graphs to help understand the data and support decision-making. However, there are generally two limitations that hinder their practical use: (1) they have multiple parameters that are hard to set but greatly influence results, (2) and they generally focus on identifying complex subgraphs while ignoring relationships between attributes of nodes.Graphs are a popular data type found in many domains. Numerous techniques have been proposed to find interesting patterns in graphs to help understand the data and support decision-making. However, there are generally two limitations that hinder their practical use: (1) they have multiple parameters that are hard to set but greatly influence results, (2) and they generally focus on identifying complex subgraphs while ignoring relationships between attributes of nodes. To address these problems, we propose a parameter-free algorithm named CSPM (Compressing Star Pattern Miner) which identifies star-shaped patterns that indicate strong correlations among attributes via the concept of conditional entropy and the minimum description length principle. Experiments performed on several benchmark datasets show that CSPM reveals insightful and interpretable patterns and is efficient in runtime. Moreover, quantitative evaluations on two real-world applications show that CSPM has broad applications as it successfully boosts the accuracy of graph attribute completion models by up to 30.68\% and uncovers important patterns in telecommunication alarm data.

preprint2022arXiv

Enhancing Hyperbolic Graph Embeddings via Contrastive Learning

Recently, hyperbolic space has risen as a promising alternative for semi-supervised graph representation learning. Many efforts have been made to design hyperbolic versions of neural network operations. However, the inspiring geometric properties of this unique geometry have not been fully explored yet. The potency of graph models powered by the hyperbolic space is still largely underestimated. Besides, the rich information carried by abundant unlabelled samples is also not well utilized. Inspired by the recently active and emerging self-supervised learning, in this study, we attempt to enhance the representation power of hyperbolic graph models by drawing upon the advantages of contrastive learning. More specifically, we put forward a novel Hyperbolic Graph Contrastive Learning (HGCL) framework which learns node representations through multiple hyperbolic spaces to implicitly capture the hierarchical structure shared between different views. Then, we design a hyperbolic position consistency (HPC) constraint based on hyperbolic distance and the homophily assumption to make contrastive learning fit into hyperbolic space. Experimental results on multiple real-world datasets demonstrate the superiority of the proposed HGCL as it consistently outperforms competing methods by considerable margins for the node classification task.

preprint2022arXiv

Geometry Aligned Variational Transformer for Image-conditioned Layout Generation

Layout generation is a novel task in computer vision, which combines the challenges in both object localization and aesthetic appraisal, widely used in advertisements, posters, and slides design. An accurate and pleasant layout should consider both the intra-domain relationship within layout elements and the inter-domain relationship between layout elements and the image. However, most previous methods simply focus on image-content-agnostic layout generation, without leveraging the complex visual information from the image. To this end, we explore a novel paradigm entitled image-conditioned layout generation, which aims to add text overlays to an image in a semantically coherent manner. Specifically, we propose an Image-Conditioned Variational Transformer (ICVT) that autoregressively generates various layouts in an image. First, self-attention mechanism is adopted to model the contextual relationship within layout elements, while cross-attention mechanism is used to fuse the visual information of conditional images. Subsequently, we take them as building blocks of conditional variational autoencoder (CVAE), which demonstrates appealing diversity. Second, in order to alleviate the gap between layout elements domain and visual domain, we design a Geometry Alignment module, in which the geometric information of the image is aligned with the layout representation. In addition, we construct a large-scale advertisement poster layout designing dataset with delicate layout and saliency map annotations. Experimental results show that our model can adaptively generate layouts in the non-intrusive area of the image, resulting in a harmonious layout design.

preprint2022arXiv

HICF: Hyperbolic Informative Collaborative Filtering

Considering the prevalence of the power-law distribution in user-item networks, hyperbolic space has attracted considerable attention and achieved impressive performance in the recommender system recently. The advantage of hyperbolic recommendation lies in that its exponentially increasing capacity is well-suited to describe the power-law distributed user-item network whereas the Euclidean equivalent is deficient. Nonetheless, it remains unclear which kinds of items can be effectively recommended by the hyperbolic model and which cannot. To address the above concerns, we take the most basic recommendation technique, collaborative filtering, as a medium, to investigate the behaviors of hyperbolic and Euclidean recommendation models. The results reveal that (1) tail items get more emphasis in hyperbolic space than that in Euclidean space, but there is still ample room for improvement; (2) head items receive modest attention in hyperbolic space, which could be considerably improved; (3) and nonetheless, the hyperbolic models show more competitive performance than Euclidean models. Driven by the above observations, we design a novel learning method, named hyperbolic informative collaborative filtering (HICF), aiming to compensate for the recommendation effectiveness of the head item while at the same time improving the performance of the tail item. The main idea is to adapt the hyperbolic margin ranking learning, making its pull and push procedure geometric-aware, and providing informative guidance for the learning of both head and tail items. Extensive experiments back up the analytic findings and also show the effectiveness of the proposed method. The work is valuable for personalized recommendations since it reveals that the hyperbolic space facilitates modeling the tail item, which often represents user-customized preferences or new products.

preprint2022arXiv

HRCF: Enhancing Collaborative Filtering via Hyperbolic Geometric Regularization

In large-scale recommender systems, the user-item networks are generally scale-free or expand exponentially. The latent features (also known as embeddings) used to describe the user and item are determined by how well the embedding space fits the data distribution. Hyperbolic space offers a spacious room to learn embeddings with its negative curvature and metric properties, which can well fit data with tree-like structures. Recently, several hyperbolic approaches have been proposed to learn high-quality representations for the users and items. However, most of them concentrate on developing the hyperbolic similitude by designing appropriate projection operations, whereas many advantageous and exciting geometric properties of hyperbolic space have not been explicitly explored. For example, one of the most notable properties of hyperbolic space is that its capacity space increases exponentially with the radius, which indicates the area far away from the hyperbolic origin is much more embeddable. Regarding the geometric properties of hyperbolic space, we bring up a Hyperbolic Regularization powered Collaborative Filtering(HRCF) and design a geometric-aware hyperbolic regularizer. Specifically, the proposal boosts optimization procedure via the root alignment and origin-aware penalty, which is simple yet impressively effective. Through theoretical analysis, we further show that our proposal is able to tackle the over-smoothing problem caused by hyperbolic aggregation and also brings the models a better discriminative ability. We conduct extensive empirical analysis, comparing our proposal against a large set of baselines on several public benchmarks. The empirical results show that our approach achieves highly competitive performance and surpasses both the leading Euclidean and hyperbolic baselines by considerable margins.

preprint2022arXiv

Learning-based AC-OPF Solvers on Realistic Network and Realistic Loads

Deep learning approaches for the Alternating Current-Optimal Power Flow (AC-OPF) problem are under active research in recent years. A common shortcoming in this area of research is the lack of a dataset that includes both a realistic power network topology and the corresponding realistic loads. To address this issue, we construct an AC-OPF formulation-ready dataset called TAS-97 that contains realistic network information and realistic bus loads from Tasmania's electricity network. We found that the realistic loads in Tasmania are correlated between buses and they show signs of an underlying multivariate normal distribution. Feasibility-optimized end-to-end deep neural network models are trained and tested on the constructed dataset. Trained on samples with bus loads generated from a fitted multivariate normal distribution, our learning-based AC-OPF solver achieves 0.13% cost optimality gap, 99.73% feasibility rate, and 38.62 times of speedup on realistic testing samples when compared to PYPOWER.

preprint2022arXiv

Towards Learning in Grey Spatiotemporal Systems: A Prophet to Non-consecutive Spatiotemporal Dynamics

Spatiotemporal forecasting is an imperative topic in data science due to its diverse and critical applications in smart cities. Existing works mostly perform consecutive predictions of following steps with observations completely and continuously obtained, where nearest observations can be exploited as key knowledge for instantaneous status estimation. However, the practical issues of early activity planning and sensor failures elicit a brand-new task, i.e., non-consecutive forecasting. In this paper, we define spatiotemporal learning systems with missing observation as Grey Spatiotemporal Systems (G2S) and propose a Factor-Decoupled learning framework for G2S (FDG2S), where the core idea is to hierarchically decouple multi-level factors and enable both flexible aggregations and disentangled uncertainty estimations. Firstly, to compensate for missing observations, a generic semantic-neighboring sequence sampling is devised, which selects representative sequences to capture both periodical regularity and instantaneous variations. Secondly, we turn the predictions of non-consecutive statuses into inferring statuses under expected combined exogenous factors. In particular, a factor-decoupled aggregation scheme is proposed to decouple factor-induced predictive intensity and region-wise proximity by two energy functions of conditional random field. To infer region-wise proximity under flexible factor-wise combinations and enable dynamic neighborhood aggregations, we further disentangle compounded influences of exogenous factors on region-wise proximity and learn to aggregate them. Given the inherent incompleteness and critical applications of G2S, a DisEntangled Uncertainty Quantification is put forward, to identify two types of uncertainty for reliability guarantees and model interpretations.

preprint2021arXiv

Developing Dipole-scheme Heterojunction Photocatalysts

The high recombination rate of photogenerated carriers is the bottleneck of photocatalysis, severely limiting the photocatalytic efficiency. Here, we develop a dipole-scheme (D-scheme for short) photocatalytic model and materials realization. The D-scheme heterojunction not only can effectively separate electrons and holes by a large polarization field, but also boosts photocatalytic redox reactions with large driving photovoltages and without any carrier loss. By means of first-principles and GW calculations, we propose a D-scheme heterojunction prototype with two real polar materials, PtSeTe/LiGaS2. This D-scheme photocatalyst exhibits a high capability of the photogenerated carrier separation and near-infrared light absorption. Moreover, our calculations of the Gibbs free energy imply a high ability of the hydrogen and oxygen evolution reaction by a large driving force. The proposed D-scheme photocatalytic model is generalized and paves a valuable route of significantly improving the photocatalytic efficiency.

preprint2020arXiv

Spatio-Temporal Hybrid Graph Convolutional Network for Traffic Forecasting in Telecommunication Networks

Telecommunication networks play a critical role in modern society. With the arrival of 5G networks, these systems are becoming even more diversified, integrated, and intelligent. Traffic forecasting is one of the key components in such a system, however, it is particularly challenging due to the complex spatial-temporal dependency. In this work, we consider this problem from the aspect of a cellular network and the interactions among its base stations. We thoroughly investigate the characteristics of cellular network traffic and shed light on the dependency complexities based on data collected from a densely populated metropolis area. Specifically, we observe that the traffic shows both dynamic and static spatial dependencies as well as diverse cyclic temporal patterns. To address these complexities, we propose an effective deep-learning-based approach, namely, Spatio-Temporal Hybrid Graph Convolutional Network (STHGCN). It employs GRUs to model the temporal dependency, while capturing the complex spatial dependency through a hybrid-GCN from three perspectives: spatial proximity, functional similarity, and recent trend similarity. We conduct extensive experiments on real-world traffic datasets collected from telecommunication networks. Our experimental results demonstrate the superiority of the proposed model in that it consistently outperforms both classical methods and state-of-the-art deep learning models, while being more robust and stable.