Researcher profile

Xiao Dong

Xiao Dong contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
13 works
0 followers
12 topics
4 close collaborators

Published work

13 published items

preprint | 2025 | arXiv

WonderHuman: Hallucinating Unseen Parts in Dynamic 3D Human Reconstruction

In this paper, we present WonderHuman to reconstruct dynamic human avatars from a monocular video for high-fidelity novel view synthesis. Previous dynamic human avatar reconstruction methods typically require the input video to have full coverage of the observed human body. However, in daily practice, one typically has access to limited viewpoints, such as monocular front-view videos, making it a cumbersome task for previous methods to reconstruct the unseen parts of the human avatar. To tackle the issue, we present WonderHuman, which leverages 2D generative diffusion model priors to achieve high-quality, photorealistic reconstructions of dynamic human avatars from monocular videos, including accurate rendering of unseen body parts. Our approach introduces a Dual-Space Optimization technique, applying Score Distillation Sampling (SDS) in both canonical and observation spaces to ensure visual consistency and enhance realism in dynamic human reconstruction. Additionally, we present a View Selection strategy and Pose Feature Injection to enforce the consistency between SDS predictions and observed data, ensuring pose-dependent effects and higher fidelity in the reconstructed avatar. In the experiments, our method achieves SOTA performance in producing photorealistic renderings from the given monocular video, particularly for those challenging unseen parts. The project page and source code can be found at https://wyiguanw.github.io/WonderHuman/.

preprint | 2022 | arXiv

Electronegativity and chemical hardness of the elements under pressure

Abundant evidence has shown the emergence of exotic chemical phenomena under pressure, including the formation of unexpected compounds and strange crystal structures. In many cases, there is no convincing explanation for these phenomena and there are virtually no chemical rules or models capable of predicting or even rationalizing these phenomena. Here we calculate, as a function of pressure, two central chemical properties of atoms, electronegativity and chemical hardness, which can be seen as the first and second-order chemical potentials. Mulliken electronegativity, which equals minus the chemical potential of the electron relative to the vacuum, is appropriately modified - instead of taking the vacuum (impossible under high pressure), we take the homogeneous electron gas as reference. We find that for most elements, chemical hardness and electronegativity decrease with pressure, consistent with pressure-induced metallization. Furthermore, we discover that pressure-induced s-d orbital transfer makes Ni, Pd and Pt "pseudo-noble-gas" atoms with a closed d-shell configuration, and the elements preceding them (Fe and especially Co, Rh, Ir) electron acceptors, while the elements right after them (Cu, Ag, Zn, Cd, for example) become highly electropositive. We show the explicative and predictive power of our electronegativity and chemical hardness scales under pressure.
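
As a rough illustration of the two quantities named in this abstract, the sketch below computes the standard vacuum-referenced finite-difference forms: Mulliken electronegativity as (I + A)/2 and chemical hardness as (I - A)/2, where I is the ionization energy and A the electron affinity. It does not reproduce the paper's pressure-dependent scale, which uses the homogeneous electron gas as reference. The function names and the input energies are hypothetical and serve only to show the arithmetic; note that some authors define hardness as I - A without the factor of 1/2.

```python
def ionization_energy(e_cation, e_neutral):
    """I = E(N-1) - E(N): energy needed to remove one electron."""
    return e_cation - e_neutral


def electron_affinity(e_neutral, e_anion):
    """A = E(N) - E(N+1): energy released on adding one electron."""
    return e_neutral - e_anion


def mulliken_electronegativity(e_cation, e_neutral, e_anion):
    """chi = (I + A) / 2, the finite-difference estimate of -dE/dN."""
    i = ionization_energy(e_cation, e_neutral)
    a = electron_affinity(e_neutral, e_anion)
    return 0.5 * (i + a)


def chemical_hardness(e_cation, e_neutral, e_anion):
    """eta = (I - A) / 2, the finite-difference estimate of (1/2) d2E/dN2."""
    i = ionization_energy(e_cation, e_neutral)
    a = electron_affinity(e_neutral, e_anion)
    return 0.5 * (i - a)


if __name__ == "__main__":
    # Hypothetical total energies in eV, chosen only to make the arithmetic easy.
    e_cation, e_neutral, e_anion = 5.0, 0.0, -1.0
    print("chi =", mulliken_electronegativity(e_cation, e_neutral, e_anion))
    print("eta =", chemical_hardness(e_cation, e_neutral, e_anion))
```

With these definitions, a smaller gap between I and A gives a smaller hardness, which is the sense in which decreasing hardness is consistent with metallization.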

preprint | 2022 | arXiv

Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval

Our goal in this research is to study a more realistic environment in which we can conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories. We first contribute the Product1M dataset and define two practical instance-level retrieval tasks to enable evaluation on price comparison and personalized recommendation. For both instance-level tasks, accurately pinpointing the product target mentioned in the visual-linguistic data and effectively decreasing the influence of irrelevant content is quite challenging. To address this, we train a more effective cross-modal pretraining model that adaptively incorporates key concept information from the multi-modal data by using an entity graph whose nodes and edges respectively denote entities and the similarity relations between entities. Specifically, a novel Entity-Graph Enhanced Cross-Modal Pretraining (EGE-CMP) model is proposed for instance-level commodity retrieval. It explicitly injects entity knowledge in both node-based and subgraph-based ways into the multi-modal networks via a self-supervised hybrid-stream transformer, which can reduce the confusion between different object contents and thereby guide the network to focus on entities with real semantics. Experimental results verify the efficacy and generalizability of EGE-CMP, which outperforms several SOTA cross-modal baselines such as CLIP, UNITER and CAPTURE.

preprint | 2022 | arXiv

M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining

Despite the potential of multi-modal pre-training to learn highly discriminative feature representations from complementary data modalities, current progress is being slowed by the lack of large-scale modality-diverse datasets. By leveraging the natural suitability of E-commerce, where different modalities capture complementary semantic information, we contribute a large-scale multi-modal pre-training dataset M5Product. The dataset comprises 5 modalities (image, text, table, video, and audio), covers over 6,000 categories and 5,000 attributes, and is 500× larger than the largest publicly available dataset with a similar number of modalities. Furthermore, M5Product contains incomplete modality pairs and noise while also having a long-tailed distribution, resembling most real-world problems. We further propose Self-harmonized ContrAstive LEarning (SCALE), a novel pretraining framework that integrates the different modalities into a unified model through an adaptive feature fusion mechanism, where the importance of each modality is learned directly from the modality embeddings and impacts the inter-modality contrastive learning and masked tasks within a multi-modal transformer model. We evaluate the current multi-modal pre-training state-of-the-art approaches and benchmark their ability to learn from unlabeled data when faced with the large number of modalities in the M5Product dataset. We conduct extensive experiments on four downstream tasks and demonstrate the superiority of our SCALE model, providing insights into the importance of dataset scale and diversity.

preprint | 2022 | arXiv

Ultrahigh-Pressure Magnesium Hydrosilicates as Reservoirs of Water in Early Earth

The origin of water on the Earth is a long-standing mystery, requiring a comprehensive search for hydrous compounds, stable at conditions of the deep Earth and made of Earth-abundant elements. Previous studies usually focused on the current range of pressure-temperature conditions in the Earth's mantle and ignored a possible difference in the past, such as the stage of the core-mantle separation. Here, using ab initio evolutionary structure prediction, we find that only two magnesium hydrosilicate phases are stable at megabar pressures, $α$-Mg$_2$SiO$_5$H$_2$ and $β$-Mg$_2$SiO$_5$H$_2$, stable at 262-338 GPa and >338 GPa, respectively (all these pressures now lie within the Earth's iron core). Both are superionic conductors with quasi-one-dimensional proton diffusion at relevant conditions. In the first 30 million years of Earth's history, before the Earth's core was formed, these must have existed in the Earth, hosting much of Earth's water. As dense iron alloys segregated to form the Earth's core, Mg$_2$SiO$_5$H$_2$ phases decomposed and released water. Thus, now-extinct Mg$_2$SiO$_5$H$_2$ phases have likely contributed in a major way to the evolution of our planet.

preprint | 2022 | arXiv

Unusual phase transition of layer-stacked borophene under pressure

The 8-Pmmn borophene, a boron analogue of graphene, hosts tilted and anisotropic massless Dirac fermion quasiparticles owing to the presence of a distorted graphene-like sublattice. First-principles calculations show that stacked 8-Pmmn borophene transforms into a fused three-dimensional borophene under pressure, accompanied by partial bond breaking and bond reforming. Strikingly, the fused 8-Pmmn borophene inherits the Dirac band dispersion, resulting in an unusual semimetal-semimetal transition. A simple tight-binding model derived from graphene qualitatively reveals the underlying physics, which stems from the maximal preservation of the graphene-like substructure after the phase transition. This contrasts sharply with the transformation of graphite into diamond, which is associated with a semimetal-insulator transition.

preprint | 2022 | arXiv

Worst-Case Dynamic Power Distribution Network Noise Prediction Using Convolutional Neural Network

Worst-case dynamic PDN noise analysis is an essential step in PDN sign-off to ensure the performance and reliability of chips. However, with the growing PDN size and increasing scenarios to be validated, it becomes very time- and resource-consuming to conduct full-stack PDN simulation to check the worst-case noise for different test vectors. Recently, various works have proposed machine-learning-based methods for supply noise prediction, many of which still suffer from large training overhead, inefficiency, or non-scalability. Thus, this paper proposes an efficient and scalable framework for worst-case dynamic PDN noise prediction. The framework first reduces the spatial and temporal redundancy in the PDN and input current vector, and then employs efficient feature extraction as well as a novel convolutional neural network architecture to predict the worst-case dynamic PDN noise. Experimental results show that the proposed framework consistently outperforms the commercial tool and the state-of-the-art machine learning method with only 0.63-1.02% mean relative error and 25-69$\times$ speedup.
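
The two figures of merit quoted in this abstract, mean relative error and speedup, can be written out as follows. This is a minimal sketch assuming the usual definitions (mean of |predicted - reference| / |reference|, and baseline runtime divided by method runtime); the abstract does not spell out its exact formulas, and all function names and numbers below are hypothetical.

```python
def mean_relative_error(predicted, reference):
    """Mean of |p - r| / |r| over paired values; reference values must be nonzero."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference))
    return total / len(reference)


def speedup(baseline_seconds, method_seconds):
    """How many times faster the method is than the baseline."""
    return baseline_seconds / method_seconds


if __name__ == "__main__":
    # Hypothetical worst-case noise values (arbitrary units) and runtimes.
    simulated = [1.00, 1.00]
    predicted = [1.01, 0.99]
    print(f"MRE: {100 * mean_relative_error(predicted, simulated):.2f}%")
    print(f"speedup: {speedup(50.0, 2.0):.0f}x")
```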

preprint | 2021 | arXiv

A Unified Joint Maximum Mean Discrepancy for Domain Adaptation

Domain adaptation has received a lot of attention in recent years, and many algorithms have been proposed with impressive progress. However, the joint probability distribution (P(X, Y)) distance has not been fully explored for this problem, since its empirical estimation derived from the maximum mean discrepancy (joint maximum mean discrepancy, JMMD) involves a complex tensor-product operator that is hard to manipulate. To solve this issue, this paper theoretically derives a unified form of JMMD that is easy to optimize, and proves that the marginal, class conditional and weighted class conditional probability distribution distances are special cases of it with different label kernels. Among these, the weighted class conditional one can not only realize feature alignment across domains at the category level, but also deal with imbalanced datasets using the class prior probabilities. From the revealed unified JMMD, we illustrate that JMMD degrades the feature-label dependence (discriminability) that benefits classification, and that it is sensitive to label distribution shift when the label kernel is the weighted class conditional one. Therefore, we leverage the Hilbert-Schmidt independence criterion and propose a novel MMD matrix to promote the dependence, and devise a novel label kernel that is robust to label distribution shift. Finally, we conduct extensive experiments on several cross-domain datasets to demonstrate the validity and effectiveness of the revealed theoretical results.
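
For orientation, the sketch below computes the plain marginal maximum mean discrepancy between two feature samples, which the abstract lists as one special case of the unified form. It is not the paper's unified JMMD: there are no label kernels here, only a Gaussian kernel on features and the standard biased estimator of the squared MMD. The function names, the bandwidth default, and the sample points are assumptions made for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)) for equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mean_kernel(first, second, bandwidth=1.0):
    """Average kernel value over all pairs drawn from the two samples."""
    total = sum(gaussian_kernel(a, b, bandwidth) for a in first for b in second)
    return total / (len(first) * len(second))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of squared MMD: E[k(s,s')] + E[k(t,t')] - 2 E[k(s,t)]."""
    return (
        mean_kernel(source, source, bandwidth)
        + mean_kernel(target, target, bandwidth)
        - 2.0 * mean_kernel(source, target, bandwidth)
    )


if __name__ == "__main__":
    # Hypothetical 2-D feature samples from a source and a shifted target domain.
    source = [[0.0, 0.0], [1.0, 0.0]]
    target = [[5.0, 5.0], [6.0, 5.0]]
    print("MMD^2, same sample:   ", mmd_squared(source, source))
    print("MMD^2, shifted sample:", mmd_squared(source, target))
```

The estimate is zero when both samples coincide and grows as the two feature distributions move apart, which is why it is commonly minimized as an alignment loss in domain adaptation.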

preprint | 2021 | arXiv

Negative linear compressibility and unusual dynamic behaviors of NaB3

First-principles calculations reveal that sodium boride (NaB3) undergoes a phase transition from a tetragonal P4/mbm phase to an orthorhombic Pbam phase at about 16 GPa, accompanied by counterintuitive lattice expansion along the crystallographic a-axis. This unusual compression behavior is identified as negative linear compressibility (NLC), which is dominantly attributed to the symmetry breaking of the boron framework. Meanwhile, the P4/mbm and Pbam phases form superionic conductors after undergoing a peculiar swap state at high temperature. Specifically, under warm conditions the Na cation pairs exhibit a rare local exchange (or rotation) behavior, which may originate from the asymmetric energy barriers of different diffusion paths. The study of the NaB3 compound sheds new light on materials that combine NLC and ion transport at extreme conditions.

preprint | 2020 | arXiv

Accelerating Deep Learning Inference with Cross-Layer Data Reuse on GPUs

Accelerating the deep learning inference is very important for real-time applications. In this paper, we propose a novel method to fuse the layers of convolutional neural networks (CNNs) on Graphics Processing Units (GPUs), which applies data reuse analysis and access optimization in different levels of the memory hierarchy. To achieve the balance between computation and memory access, we explore the fusion opportunities in the CNN computation graph and propose three fusion modes of convolutional neural networks: straight, merge and split. Then, an approach for generating efficient fused code is designed, which goes deeper in multi-level memory usage for cross-layer data reuse. The effectiveness of our method is evaluated with the network layers from state-of-the-art CNNs on two different GPU platforms, NVIDIA TITAN Xp and Tesla P4. The experiments show that the average speedup is 2.02x on representative structures of CNNs, and 1.57x on end-to-end inference of SqueezeNet.

preprint | 2020 | arXiv

Helium Induced Nitrogen Salt at High Pressure

The energy landscape of helium-nitrogen mixtures is explored by ab initio evolutionary searches, which predicted several stable helium-nitrogen compounds in the pressure range from 25 to 100 GPa. In particular, the monoclinic structure of HeN$_{22}$ consists of neutral He atoms, partially ionic dimers N$_{2}$$^{δ-}$, and lantern-like cages N$_{20}$$^{δ+}$. The presence of helium not only greatly enhances structural diversity of nitrogen solids, but also tremendously lowers the formation pressure of nitrogen salt. The unique nitrogen framework of (HeN$_{20}$)$^{δ+}$N$_{2}$$^{δ-}$ may be quenchable to ambient pressure even after removing helium. The estimated energy density of N$_{20}$$^{δ+}$N$_{2}$$^{δ-}$ (10.44 kJ/g) is $\sim$2.4 times larger than that of trinitrotoluene (TNT), indicating a very promising high-energy-density material.

preprint | 2020 | arXiv

Theoretical study of the pressure-induced structure, phase transition, mechanical and electronic properties in the V-N system

Stable compounds in the V-N system are systematically searched and four new high-pressure phases are found, including C2/m-V$_9$N, Pbam-V$_5$N$_2$, Pnma-V$_2$N and I4/mcm-VN$_2$. V$_2$N undergoes a phase transition from $\varepsilon$-Fe$_2$N-type V$_2$N (P$\bar{3}$1m) to $ζ$-Fe$_2$N-type V$_2$N (Pbcn) at 10 GPa and to Fe$_2$C-type V$_2$N (Pnnm) at 59 GPa, then to Pnma-V$_2$N at 96 GPa. Low-temperature tetragonal VN is theoretically proved to belong to space group P$\bar{4}$2m. The estimated Vickers hardnesses and fracture toughness of WC-type VN are around 37 GPa and 4.3-6.1 MPa m$^{1/2}$, respectively. Al$_2$Cu-type VN$_2$ (I4/mcm) with a Vickers hardness of 25-27 GPa and fracture toughness of 3.6-6.6 MPa m$^{1/2}$ also shows excellent mechanical properties. Elastic properties of WC-type mononitrides of transition metals from IVB group (Ti, Zr and Hf), VB group (V, Nb and Ta) and VIB (Cr, Mo and W) are calculated and compared. Both the bond strength and structural configuration determine the mechanical properties of a material.

preprint | 2019 | arXiv

High-Temperature Superconductivity in the Ti--H System at High Pressures

Search for stable high-pressure compounds in the Ti--H system reveals the existence of titanium hydrides with new stoichiometries, including Ibam-Ti$_2$H$_5$, I4/m-Ti$_5$H$_{13}$, I$\bar{4}$-Ti$_5$H$_{14}$, Fddd-TiH$_4$, Immm-Ti$_2$H$_{13}$, P$\bar{1}$-TiH$_{12}$, and C2/m-TiH$_{22}$. Our calculations predict I4/mmm $\rightarrow$ R$\bar{3}$m and I4/mmm $\rightarrow$ Cmma transitions in TiH and TiH$_2$, respectively. Phonons and the electron--phonon coupling of all searched titanium hydrides are analyzed at high pressure. It is found that Immm-Ti$_2$H$_{13}$ rather than the highest hydrogen content C2/m-TiH$_{22}$, exhibits the highest superconducting critical temperature T$_{c}$. The estimated T$_{c}$ of Immm-Ti$_2$H$_{13}$ and C2/m-TiH$_{22}$ are respectively 127.4--149.4 K ($μ^{*}$=0.1-0.15) at 350 GPa and 91.3--110.2 K at 250 GPa by numerically solving the Eliashberg equations. One of the effects of pressure on T$_{c}$ can be attributed to the softening and hardening of phonons with increasing pressure.