Researcher profile

Xiong Zhang

Xiong Zhang contributes to research discovery and scholarly infrastructure.

Researcher; Affiliation not imported; Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging; Verification L1; Unclaimed author
9 works
0 followers
10 topics
4 close collaborators

Published work

9 published item(s)

preprint, 2022, arXiv

Convolutional Embedding Makes Hierarchical Vision Transformer Stronger

Vision Transformers (ViTs) have recently dominated a range of computer vision tasks, yet they suffer from low training data efficiency and inferior local semantic representation capability without appropriate inductive bias. Convolutional neural networks (CNNs) inherently capture region-aware semantics, inspiring researchers to introduce CNNs back into the architecture of ViTs to provide the desired inductive bias. However, is the locality achieved by the micro-level CNNs embedded in ViTs good enough? In this paper, we investigate the problem by exploring in depth how the macro architecture of hybrid CNNs/ViTs enhances the performance of hierarchical ViTs. In particular, we study the role of the token embedding layers, also called convolutional embedding (CE), and systematically reveal how CE injects desirable inductive bias into ViTs. In addition, we apply the optimal CE configuration to 4 recently released state-of-the-art ViTs, effectively boosting their performance. Finally, a family of efficient hybrid CNNs/ViTs, dubbed CETNets, is released, which may serve as generic vision backbones. Specifically, CETNets achieve 84.9% Top-1 accuracy on ImageNet-1K (training from scratch), 48.6% box mAP on the COCO benchmark, and 51.6% mIoU on ADE20K, substantially improving on the corresponding state-of-the-art baselines.

preprint, 2022, arXiv

MobRecon: Mobile-Friendly Hand Mesh Reconstruction from Monocular Image

In this work, we propose a framework for single-view hand mesh reconstruction, which can simultaneously achieve high reconstruction accuracy, fast inference speed, and temporal coherence. Specifically, for 2D encoding, we propose lightweight yet effective stacked structures. For 3D decoding, we provide an efficient graph operator, namely depth-separable spiral convolution. Moreover, we present a novel feature lifting module for bridging the gap between 2D and 3D representations. This module begins with a map-based position regression (MapReg) block that integrates the merits of both the heatmap encoding and position regression paradigms for improved 2D accuracy and temporal coherence. MapReg is followed by pose pooling and pose-to-vertex lifting, which transform 2D pose encodings into semantic features of 3D vertices. Overall, our hand reconstruction framework, called MobRecon, has an affordable computational cost and a small model size, and reaches a high inference speed of 83 FPS on an Apple A14 CPU. Extensive experiments on popular datasets such as FreiHAND, RHD, and HO3Dv2 demonstrate that MobRecon achieves superior performance in reconstruction accuracy and temporal coherence. Our code is publicly available at https://github.com/SeanChenxy/HandMesh.

preprint, 2021, arXiv

Detection of a possible high-confidence radio quasi-periodic oscillation in the BL Lac PKS J2134-0153

We have searched for quasi-periodic oscillations (QPOs) in the BL Lac PKS J2134-0153 using the 15 GHz radio light curve released by the Owens Valley Radio Observatory 40-m telescope for the period from 2008-01-05 to 2019-05-18, utilizing the Lomb-Scargle periodogram (LSP) and the weighted wavelet Z-transform (WWZ) techniques. This is the first search for a periodic radio signal in BL Lac PKS J2134-0153 with these two methods. The two methods consistently reveal a QPO of 4.69 $\pm$ 0.14 years (>5$\sigma$ confidence level). We discuss possible causes for this QPO, and we expect that the binary black hole scenario, in which the QPO is caused by the precession of the binary black holes, is the most likely explanation. BL Lac PKS J2134-0153 could thus be a good binary black hole candidate. In the binary black hole scenario, the distance between the primary black hole and the secondary black hole is 1.83$\times$10$^{16}$ cm.
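To make the periodogram step concrete, here is a minimal sketch of the classical Lomb-Scargle periodogram applied to an unevenly sampled series. Everything in it is an assumption for illustration: the light curve is synthetic (a noisy sinusoid with a 4.69-year period over a hypothetical 40-year baseline), not the OVRO data, and the function name `lomb_scargle` and the frequency grid are made up here. It does not estimate significance levels or implement the WWZ analysis.

```python
import math
import random


def lomb_scargle(times, values, frequencies):
    """Classical Lomb-Scargle power at each trial frequency (cycles per time unit)."""
    n = len(values)
    mean = sum(values) / n
    resid = [v - mean for v in values]
    variance = sum(r * r for r in resid) / (n - 1)
    powers = []
    for freq in frequencies:
        omega = 2.0 * math.pi * freq
        # The offset tau makes the sine and cosine terms orthogonal
        # for this particular set of sampling times.
        sin2 = sum(math.sin(2.0 * omega * t) for t in times)
        cos2 = sum(math.cos(2.0 * omega * t) for t in times)
        tau = math.atan2(sin2, cos2) / (2.0 * omega)
        cos_terms = [math.cos(omega * (t - tau)) for t in times]
        sin_terms = [math.sin(omega * (t - tau)) for t in times]
        yc = sum(r * c for r, c in zip(resid, cos_terms))
        ys = sum(r * s for r, s in zip(resid, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        powers.append((yc * yc / cc + ys * ys / ss) / (2.0 * variance))
    return powers


# Synthetic, irregularly sampled light curve (assumed values, not real data).
random.seed(0)
TRUE_PERIOD = 4.69  # years
times = sorted(random.uniform(0.0, 40.0) for _ in range(300))
values = [math.sin(2.0 * math.pi * t / TRUE_PERIOD) + random.gauss(0.0, 0.2)
          for t in times]

# Trial frequencies from 0.1 to 0.5 cycles per year (periods of 10 down to 2 years).
frequencies = [0.1 + 0.001 * k for k in range(401)]
powers = lomb_scargle(times, values, frequencies)
best_period = 1.0 / frequencies[powers.index(max(powers))]
print(f"Strongest periodicity near {best_period:.2f} years")
```

In practice a maintained implementation would be preferable to this hand-rolled loop, and a claimed confidence level would need a separate false-alarm analysis.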

preprint, 2020, arXiv

Asymmetrical Vertical Federated Learning

Federated learning is a distributed machine learning method that aims to preserve the privacy of sample features and labels. In a federated learning system, ID-based sample alignment approaches are usually applied, with little effort made to protect ID privacy. In real-life applications, however, the confidentiality of sample IDs, which are the strongest row identifiers, is also drawing much attention from many participants. To ease their concerns about ID privacy, this paper formally proposes the notion of asymmetrical vertical federated learning and illustrates how to protect sample IDs. The standard private set intersection protocol is adapted to achieve the asymmetrical ID alignment phase in an asymmetrical vertical federated learning system. Correspondingly, a Pohlig-Hellman realization of the adapted protocol is provided. This paper also presents a genuine-with-dummy approach to achieving asymmetrical federated model training. To illustrate its application, a federated logistic regression algorithm is provided as an example. Experiments are also conducted to validate the feasibility of this approach.
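As background for the ID alignment phase, the following is a minimal sketch of the standard (symmetric) private set intersection built on a Pohlig-Hellman style commutative exponentiation cipher. It is not the asymmetrical adaptation the paper proposes, whose details the abstract does not give. The prime, the helper names (`keygen`, `hash_to_group`, `encrypt`, `intersect`), and the sample IDs are assumptions for illustration, and the 127-bit modulus is far too small for real use.

```python
import hashlib
import math
import secrets

# 2**127 - 1 is a Mersenne prime; chosen only to keep the example small.
P = 2**127 - 1


def keygen():
    """Pick a secret exponent coprime to P - 1, so exponentiation is a bijection."""
    while True:
        key = secrets.randbelow(P - 3) + 2  # uniform in 2 .. P-2
        if math.gcd(key, P - 1) == 1:
            return key


def hash_to_group(sample_id):
    """Map a string ID to an integer in 1 .. P-1."""
    digest = hashlib.sha256(sample_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def encrypt(value, key):
    """Exponentiation cipher: encrypting with two keys commutes."""
    return pow(value, key, P)


def intersect(ids_a, ids_b):
    """Simulate both parties in one process and return A's IDs that B also holds."""
    key_a, key_b = keygen(), keygen()
    # A -> B: A's hashed IDs under key_a (A remembers the order).
    a_once = [encrypt(hash_to_group(i), key_a) for i in ids_a]
    # B -> A: the same values re-encrypted under key_b, order preserved.
    a_twice = [encrypt(v, key_b) for v in a_once]
    # B -> A: B's hashed IDs under key_b (B would shuffle these in practice).
    b_once = [encrypt(hash_to_group(i), key_b) for i in ids_b]
    # A: completes the double encryption of B's values and compares.
    b_twice = {encrypt(v, key_a) for v in b_once}
    return [i for i, v in zip(ids_a, a_twice) if v in b_twice]


common = intersect(["u1", "u2", "u3"], ["u3", "u4", "u1"])
print(common)
```

Because both keys are applied to every compared value, neither party sees the other's hashed IDs in the clear; only doubly encrypted values are matched.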

preprint, 2020, arXiv

Distributed Privacy Preserving Iterative Summation Protocols

In this paper, we study the problem of evaluating the sum of secrets. The secrets are distributed over a network of nodes that form a ring graph. Privacy-preserving iterative protocols for computing the sum of the secrets are proposed, which remain compatible with nodes joining and leaving. Theoretical bounds are derived regarding utility and accuracy, and the proposed protocols are shown to comply with differential privacy requirements. Based on utility, accuracy and privacy, we also provide guidance on appropriate selections of random noise parameters. Additionally, a few numerical examples that demonstrate their effectiveness are provided.
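For orientation, here is a minimal sketch of the classical masked summation on a ring, in which the initiating node hides the running total behind a random mask that it removes at the end. This is a textbook baseline, not the iterative, differentially private protocols proposed in the paper. The function name `ring_sum` and the modulus are assumptions, and the sketch does not handle nodes joining or leaving.

```python
import secrets

# All arithmetic is done modulo this value; the true sum must be smaller than it.
MODULUS = 2**64


def ring_sum(secret_values):
    """Pass a masked running total once around the ring and unmask it at the end."""
    # Node 0 draws a random mask so its neighbour never sees its raw secret.
    mask = secrets.randbelow(MODULUS)
    running = (mask + secret_values[0]) % MODULUS
    # Each remaining node adds its own secret and forwards the total.
    for secret in secret_values[1:]:
        running = (running + secret) % MODULUS
    # The total returns to node 0, which removes the mask.
    return (running - mask) % MODULUS


total = ring_sum([3, 10, 25, 4])
print(total)
```

The mask hides partial sums from individual nodes but offers no differential privacy guarantee; adding calibrated random noise, as the abstract describes, is a separate step that this sketch omits.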

preprint, 2020, arXiv

Real-time Earthquake Early Warning with Deep Learning: Application to the 2016 Central Apennines, Italy Earthquake Sequence

Earthquake early warning systems are required to report earthquake locations and magnitudes as quickly as possible, before the arrival of the damaging S wave, to mitigate seismic hazards. Deep learning techniques offer the potential to extract earthquake source information from full seismic waveforms instead of seismic phase picks. We developed a novel deep learning earthquake early warning system that utilizes fully convolutional networks to simultaneously detect earthquakes and estimate their source parameters from continuous seismic waveform streams. The system determines earthquake location and magnitude as soon as one station receives earthquake signals and progressively refines the solutions as more data arrive. We apply the system to the 2016 Mw 6.0 earthquake in Central Apennines, Italy and its subsequent sequence. Earthquake locations and magnitudes can be reliably determined as early as four seconds after the earliest P phase, with mean error ranges of 6.8-3.7 km and 0.31-0.23, respectively.

preprint, 2020, arXiv

Teddy: A System for Interactive Review Analysis

Reviews are integral to e-commerce services and products. They contain a wealth of information about the opinions and experiences of users, which can help better understand consumer decisions and improve user experience with products and services. Today, data scientists analyze reviews by developing rules and models to extract, aggregate, and understand information embedded in the review text. However, working with thousands of reviews, which are typically noisy, incomplete text, can be daunting without proper tools. Here, we first contribute results from an interview study that we conducted with fifteen data scientists who work with review text, providing insights into their practices and challenges. The results suggest that data scientists need interactive systems for many review analysis tasks. In response, we introduce Teddy, an interactive system that enables data scientists to quickly obtain insights from reviews and improve their extraction and modeling pipelines.

preprint, 2018, arXiv

Locating earthquakes with a network of seismic stations via a deep learning method

The accurate and automated determination of earthquake locations is still a challenging endeavor. However, such information is critical for monitoring seismic activity and assessing potential hazards in real time. Recently, a convolutional neural network was applied to detect earthquakes from single-station waveforms and approximately map events across several large surface areas. In this study, we apply a fully convolutional network to data from 30 network stations to locate 194 earthquakes induced during oil and gas operations in Oklahoma, USA, with average errors of approximately 4.9 km in epicenter and 1.0 km in depth relative to the catalog locations. The network is trained on 1,013 historical events, and the output is a 3D volume of the event location probability in the Earth. The trained system requires approximately one hundredth of a second to locate an event, without the need for any velocity model or human intervention.
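One plausible way to turn a 3D probability volume into a single location is to take the grid cell with the highest probability and convert its indices to coordinates. The abstract does not say how the location is extracted, so the sketch below is an assumption, as are the function name `locate_from_volume`, the grid origin and spacing, and the toy volume.

```python
def locate_from_volume(volume, origin, spacing):
    """Return the (x, y, depth) coordinates of the highest-probability cell.

    volume is a nested list indexed as volume[ix][iy][iz]; origin and spacing
    are 3-tuples giving the coordinate of cell (0, 0, 0) and the cell size.
    """
    best_index = (0, 0, 0)
    best_prob = float("-inf")
    for ix, plane in enumerate(volume):
        for iy, column in enumerate(plane):
            for iz, prob in enumerate(column):
                if prob > best_prob:
                    best_prob = prob
                    best_index = (ix, iy, iz)
    # Convert grid indices to physical coordinates.
    return tuple(o + i * s for o, i, s in zip(origin, best_index, spacing))


# Toy 2 x 2 x 2 volume (made-up probabilities) with its peak at index (1, 0, 1).
toy_volume = [
    [[0.01, 0.02], [0.03, 0.04]],
    [[0.10, 0.70], [0.05, 0.05]],
]
location = locate_from_volume(toy_volume,
                              origin=(0.0, 0.0, 0.0),
                              spacing=(1.0, 1.0, 0.5))
print(location)
```

A real system might instead fit a smooth peak or take a probability-weighted centroid to obtain sub-grid precision.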