Researcher profile

Nihang Fu

Nihang Fu contributes to research discovery and scholarly infrastructure.

Researcher; affiliation not imported; open to collaborate

Trust snapshot

Quick read

Trust 17 (Unverified); Verification L1; Unclaimed author
4 works
0 followers
5 topics
4 close collaborators


Published work

4 published items

Preprint, 2022, arXiv

DeepXRD, a Deep Learning Model for Predicting of XRD spectrum from Materials Composition

One of the long-standing problems in materials science is how to predict a material's structure, and then its properties, given only its composition. Experimental characterization of crystal structures is widely used for structure determination, but it is too expensive for high-throughput screening. At the same time, directly predicting crystal structures from compositions remains a challenging unsolved problem. Herein we propose a deep learning algorithm for predicting the XRD spectrum given only the composition of a material. The predicted spectrum can then be used to infer key structural features for downstream structural analysis, such as crystal system or space group classification, crystal lattice parameter determination, or materials property prediction. Benchmark studies on two datasets show that our DeepXRD algorithm achieves good performance for XRD prediction as evaluated over our test sets. It can thus be used for high-throughput screening of the huge materials composition space for new materials discovery.

Preprint, 2022, arXiv

Designing novel protein structures using sequence generator and AlphaFold2

Protein structures and functions are determined by a contiguous arrangement of amino acid sequences. Designing novel protein sequences and structures with desired geometry and functions is a complex task with large state spaces. Here we develop a novel protein design pipeline consisting of two deep learning algorithms, ProteinSolver and AlphaFold2. ProteinSolver is a deep graph neural network that generates amino acid sequences such that the forces between interacting amino acids are favorable and compatible with the fold while AlphaFold2 is a deep learning algorithm that predicts the protein structures from protein sequences. We present forty de novo designed binding sites of the PTP1B and P53 proteins with high precision, out of which thirty proteins are novel. Using ProteinSolver and AlphaFold2 in conjunction, we can trim the exploration of the large protein conformation space, thus expanding the ability to find novel and diverse de novo protein designs.

Preprint, 2022, arXiv

Materials Transformers Language Models for Generative Materials Design: a benchmark study

Transformer language models pre-trained on large unlabeled corpora have produced state-of-the-art results in natural language processing, organic molecule design, and protein sequence generation. However, no such models have been applied to learn the composition patterns of inorganic materials. Here we train a series of seven modern transformer language models (GPT, GPT-2, GPT-Neo, GPT-J, BLMM, BART, and RoBERTa) using the expanded formulas of materials deposited in the ICSD, OQMD, and Materials Project databases. Six different datasets, with or without non-charge-neutral or balanced electronegativity samples, are used to benchmark the performance and uncover the generation biases of modern transformer models for the generative design of materials compositions. Our extensive experiments show that materials transformers based on causal language models can generate chemically valid materials compositions, of which as many as 97.54% are charge neutral and 91.40% are electronegativity balanced, an enrichment more than 6 times higher than that of a baseline pseudo-random sampling algorithm. These models also demonstrate high novelty, and their potential for new materials discovery is demonstrated by their ability to recover the left-out materials. We also find that the properties of the generated samples can be tailored by training the models on selected training sets, such as high-bandgap materials. Our experiments also show that each model has its own preference in terms of the properties of the generated samples, and that running time complexity varies considerably between models. We have applied our materials transformer models to discover a set of new materials, as validated using DFT calculations.
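The abstract above reports the share of generated compositions that are charge neutral. As a rough illustration of what such a check might look like (this is not the authors' implementation), the Python sketch below tests whether a composition can sum to zero charge when each element takes one oxidation state. The oxidation-state table, the function names, and the single-state-per-element rule are all assumptions made for this example.

```python
from itertools import product

# Illustrative oxidation states only (an assumption for this sketch);
# a real check would draw on a much fuller table.
OXIDATION_STATES = {
    "Na": [1],
    "Cl": [-1],
    "Sr": [2],
    "Ti": [2, 3, 4],
    "Fe": [2, 3],
    "O": [-2],
}


def is_charge_neutral(composition):
    """Return True if some choice of one oxidation state per element sums to zero.

    `composition` maps element symbols to integer atom counts,
    for example {"Sr": 1, "Ti": 1, "O": 3}.
    """
    elements = list(composition)
    choices = [OXIDATION_STATES[el] for el in elements]
    for states in product(*choices):
        total = sum(q * composition[el] for q, el in zip(states, elements))
        if total == 0:
            return True
    return False


def neutral_fraction(compositions):
    """Fraction of the given compositions that pass the neutrality check."""
    hits = sum(is_charge_neutral(c) for c in compositions)
    return hits / len(compositions)


generated = [
    {"Na": 1, "Cl": 1},           # +1 - 1 = 0
    {"Sr": 1, "Ti": 1, "O": 3},   # +2 + 4 - 6 = 0
    {"Na": 2, "Cl": 1},           # +2 - 1 = +1, never neutral
]
print(neutral_fraction(generated))
```

Because the sketch assigns a single oxidation state per element, it rejects mixed-valence compounds such as Fe3O4; a fuller check would allow several states for the same element. An enrichment figure like the one quoted in the abstract could then be formed as the ratio of this fraction for model output to the same fraction for a random baseline.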

Preprint, 2020, arXiv

Simultaneously-Collected Multimodal Lying Pose Dataset: Towards In-Bed Human Pose Monitoring under Adverse Vision Conditions

Computer vision (CV) has achieved great success in interpreting semantic meaning from images, yet CV algorithms can be brittle for tasks with adverse vision conditions and for those that suffer from limited data/label pairs. One of these tasks is in-bed human pose estimation, which has significant value in many healthcare applications. In-bed pose monitoring in natural settings could involve complete darkness or full occlusion. Furthermore, the lack of publicly available in-bed pose datasets hinders the use of many successful pose estimation algorithms for this task. In this paper, we introduce our Simultaneously-collected multimodal Lying Pose (SLP) dataset, which includes in-bed pose images from 109 participants captured using multiple imaging modalities, including RGB, long wave infrared, depth, and pressure map. We also present a physical hyperparameter tuning strategy for ground truth pose label generation under extreme conditions, such as lights off and being fully covered by a sheet/blanket. The SLP design is compatible with mainstream human pose datasets; therefore, state-of-the-art 2D pose estimation models can be trained effectively on SLP data, with promising performance as high as 95% at PCKh@0.5 on a single modality. The pose estimation performance can be further improved by including additional modalities through collaboration.
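The PCKh@0.5 figure quoted above counts a predicted joint as correct when it falls within half a head-segment length of the ground-truth joint. A minimal Python sketch of that metric for a single pose follows; the function name, the argument layout, and the sample coordinates are assumptions made for illustration and are not taken from the SLP code.

```python
import math


def pckh(pred, truth, head_size, alpha=0.5):
    """Fraction of joints predicted within alpha * head_size of the ground truth.

    `pred` and `truth` are equal-length lists of (x, y) joint coordinates,
    and `head_size` is the reference head-segment length in the same units.
    """
    threshold = alpha * head_size
    correct = 0
    for (px, py), (tx, ty) in zip(pred, truth):
        if math.hypot(px - tx, py - ty) <= threshold:
            correct += 1
    return correct / len(truth)


# Made-up coordinates: with head_size 10 the threshold is 5 units.
truth = [(0, 0), (10, 0), (20, 0), (30, 0)]
pred = [(1, 0), (10, 4), (20, 6), (30, 0)]  # errors of 1, 4, 6 and 0 units
print(pckh(pred, truth, head_size=10))  # 0.75
```

In practice the score is averaged over every annotated joint in a test set rather than computed for one pose, and joints without annotations are normally excluded before counting.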