Researcher profile

Zhen Peng

Zhen Peng contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
5 topics
4 close collaborators

Published work

3 published items

Preprint, 2022, arXiv

Speed-ANN: Low-Latency and High-Accuracy Nearest Neighbor Search via Intra-Query Parallelism

Nearest Neighbor Search (NNS) has recently drawn a rapid increase of interest due to its core role in managing high-dimensional vector data in data science and AI applications. The interest is fueled by the success of neural embedding, where deep learning models transform unstructured data into semantically correlated feature vectors for data analysis, e.g., recommending popular items. Among several categories of methods for fast NNS, similarity graphs are one of the most successful algorithmic trends. Several of the most popular and top-performing similarity graphs, such as NSG and HNSW, at their core employ best-first traversal along the underlying graph indices to search for near neighbors. Maximizing the performance of the search is essential for many tasks, especially in the large-scale and high-recall regime. In this work, we provide an in-depth examination of state-of-the-art similarity search algorithms, revealing the challenges they face in leveraging multi-core processors to speed up search. We also explore whether similarity graph search is robust to deviation from maintaining strict order by allowing multiple walkers to simultaneously advance the search frontier. Based on our insights, we propose Speed-ANN, a parallel similarity search algorithm that exploits hidden intra-query parallelism and the memory hierarchy, allowing similarity search to take advantage of multiple CPU cores to significantly accelerate search while achieving high accuracy. We evaluate Speed-ANN on a wide range of datasets, ranging from a million to a billion data points, and show that its query latency is shorter than that of NSG and HNSW. Besides, with multicore support, we show that our approach offers lower search latency than a highly optimized GPU implementation and scales well as the number of hardware resources (e.g., CPU cores) and the graph size increase.
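
For orientation, the sequential best-first traversal that the abstract attributes to NSG- and HNSW-style indices can be sketched as below. This is a minimal illustration of the generic baseline, not Speed-ANN's parallel algorithm and not code from the paper. The function name `best_first_search`, the adjacency-list `graph`, the `queue_size` parameter, and the use of Euclidean distance are all assumptions made for the example.

```python
import heapq
import math


def best_first_search(graph, vectors, query, entry, k, queue_size):
    """Best-first traversal of a similarity graph (illustrative sketch).

    graph      : dict mapping vertex id -> list of neighbor ids (assumed layout)
    vectors    : dict mapping vertex id -> coordinate tuple
    query      : coordinate tuple of the query point
    entry      : vertex id where the search starts
    k          : number of neighbors to return
    queue_size : how many best vertices to keep while searching (>= k)
    """
    def dist(i):
        return math.dist(vectors[i], query)

    visited = {entry}
    # Min-heap of (distance, id): unexpanded candidates, closest first.
    candidates = [(dist(entry), entry)]
    # Max-heap (negated distance): the best queue_size vertices seen so far.
    best = [(-candidates[0][0], entry)]

    while candidates:
        d, v = heapq.heappop(candidates)
        # Stop when the closest unexpanded candidate is farther than
        # the worst vertex currently kept in a full result set.
        if len(best) >= queue_size and d > -best[0][0]:
            break
        for u in graph[v]:
            if u in visited:
                continue
            visited.add(u)
            du = dist(u)
            if len(best) < queue_size or du < -best[0][0]:
                heapq.heappush(candidates, (du, u))
                heapq.heappush(best, (-du, u))
                if len(best) > queue_size:
                    heapq.heappop(best)  # drop the current worst

    ranked = sorted((-neg_d, i) for neg_d, i in best)
    return [i for _, i in ranked[:k]]
```

Each iteration expands only the single closest candidate, so the loop is inherently sequential. That strict ordering is the property the abstract proposes to relax by letting multiple walkers advance the frontier at once.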

Preprint, 2020, arXiv

An ecological framework for the analysis of prebiotic chemical reaction networks and their dynamical behavior

It is becoming widely accepted that very early in the origin of life, even before the emergence of genetic encoding, reaction networks of diverse small chemicals might have manifested key properties of life, namely self-propagation and adaptive evolution. To explore this possibility, we formalize the dynamics of chemical reaction networks within the framework of chemical ecosystem ecology. To capture the idea that life-like chemical systems are maintained out of equilibrium by fluxes of energy-rich food chemicals, we model chemical ecosystems in well-mixed containers that are subject to constant dilution by a solution with a fixed concentration of food chemicals. Modelling all chemical reactions as fully reversible, we show that seeding an autocatalytic cycle (AC) with tiny amounts of one or more of its member chemicals results in logistic growth of all member chemicals in the cycle. This finding justifies drawing an instructive analogy between an AC and the population of a biological species. We extend this finding to show that pairs of ACs can show competitive, predator-prey, or mutualistic associations just like biological species. Furthermore, when there is stochasticity in the environment, particularly in the seeding of ACs, chemical ecosystems can show complex dynamics that can resemble evolution. The evolutionary character is especially clear when the network architecture results in ecological precedence (survival of the first), which makes the path of succession historically contingent on the order in which cycles are seeded. For all its simplicity, the framework developed here is helpful for visualizing how autocatalysis in prebiotic chemical reaction networks can yield life-like properties. Furthermore, chemical ecosystem ecology could provide a useful foundation for exploring the emergence of adaptive dynamics and the origins of polymer-based genetic systems.
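
The logistic-growth claim can be illustrated with a toy model. The sketch below is a deliberately reduced assumption, not the paper's model: a single autocatalyst A fed by one food chemical F through the reversible reaction A + F <-> 2A, in a well-mixed container diluted at a constant rate by a solution with fixed food concentration. All names and rate constants (`kf`, `kr`, `dilution`, `food_in`, `seed`) are hypothetical. Under these assumptions, once A + F has relaxed to `food_in`, the dynamics reduce to dA/dt = rA(1 - A/K) with r = kf*food_in - dilution and K = r/(kf + kr).

```python
def simulate_autocatalyst(kf=1.0, kr=0.1, dilution=0.1, food_in=1.0,
                          seed=1e-6, dt=0.01, steps=10_000):
    """Forward-Euler simulation of A + F <-> 2A in a diluted container.

    Returns the concentration of A at every time step. All parameters
    are illustrative assumptions, not values from the paper.
    """
    a, f = seed, food_in          # tiny seed of A, container full of food
    trajectory = [a]
    for _ in range(steps):
        # Net rate of the reversible reaction A + F <-> 2A.
        net = kf * a * f - kr * a * a
        da = net - dilution * a                    # production minus washout
        df = dilution * (food_in - f) - net        # inflow minus consumption
        a += dt * da
        f += dt * df
        trajectory.append(a)
    return trajectory


if __name__ == "__main__":
    traj = simulate_autocatalyst()
    carrying_capacity = (1.0 * 1.0 - 0.1) / (1.0 + 0.1)
    print(f"final A = {traj[-1]:.4f}, predicted K = {carrying_capacity:.4f}")
```

The trajectory stays near zero, rises steeply, and then saturates near K, which is the sigmoidal shape that motivates the analogy between an autocatalytic cycle and a biological population.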

Preprint, 2020, arXiv

Graph Representation Learning via Graphical Mutual Information Maximization

The richness in the content of various information networks such as social networks and communication networks provides unprecedented potential for learning high-quality expressive representations without external supervision. This paper investigates how to preserve and extract the abundant information from graph-structured data into embedding space in an unsupervised manner. To this end, we propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations. GMI generalizes the idea of conventional mutual information computations from vector space to the graph domain, where measuring mutual information from the two aspects of node features and topological structure is indispensable. GMI exhibits several benefits: first, it is invariant to the isomorphic transformation of input graphs, an inevitable constraint in many existing graph representation learning algorithms; second, it can be efficiently estimated and maximized by current mutual information estimation methods such as MINE; finally, our theoretical analysis confirms its correctness and rationality. With the aid of GMI, we develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder. Extensive experiments on transductive as well as inductive node classification and link prediction demonstrate that our method outperforms state-of-the-art unsupervised counterparts, and sometimes even exceeds the performance of supervised ones.
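
As background for the estimation step, the standard definition of mutual information and the Donsker-Varadhan lower bound that MINE optimizes are given below. This is the generic bound for two random variables, not the paper's definition of GMI. The symbols X (input), Z (representation), and the critic network T_theta are notation assumed here for illustration.

```latex
% Mutual information as a KL divergence between the joint
% distribution and the product of the marginals.
I(X; Z) = D_{\mathrm{KL}}\left( P_{XZ} \,\|\, P_X \otimes P_Z \right)

% Donsker-Varadhan lower bound maximized by MINE, where T_\theta
% is a neural critic scoring (x, z) pairs.
I(X; Z) \;\ge\; \sup_{\theta}\;
  \mathbb{E}_{P_{XZ}}\left[ T_\theta(x, z) \right]
  - \log \mathbb{E}_{P_X \otimes P_Z}\left[ e^{T_\theta(x, z)} \right]
```

Maximizing the right-hand side over both the critic and the encoder that produces Z gives a trainable objective, which is the general mechanism the abstract refers to when it says GMI can be estimated and maximized with methods such as MINE.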