Source author record

Sadman Sadeed Omee

Sadman Sadeed Omee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

cond-mat.mtrl-sci, Machine Learning

Catalog footprint

What is connected

3 works

2 topics

4 close collaborators

Preprint, 2025, arXiv

In context learning Foundation models for Materials Property Prediction with Small datasets

Foundation models (FMs) have recently shown remarkable in-context learning (ICL) capabilities across diverse scientific domains. In this work, we introduce a unified in-context learning foundation model (ICL-FM) framework for materials property prediction that integrates both composition-based and structure-aware representations. The proposed approach couples the pretrained TabPFN transformer with graph neural network (GNN)-derived embeddings and our novel MagpieEX descriptors. MagpieEX augments traditional features with cation-anion interaction data to explicitly measure bond ionicity and charge-transfer asymmetry, capturing interatomic bonding characteristics that influence vibrational and thermal transport properties. Comprehensive experiments on the MatBench benchmark suite and a standalone lattice thermal conductivity (LTC) dataset demonstrate that ICL-FM achieves competitive or superior performance to state-of-the-art (SOTA) models with significantly reduced training costs. Remarkably, the training-free ICL-FM outperformed sophisticated SOTA GNN models in five out of six representative composition-based tasks, including a significant 9.93% improvement in phonon frequency prediction. On the LTC dataset, the FM effectively models complex phenomena such as phonon-phonon scattering and atomic mass contrast. t-SNE analysis reveals that the FM acts as a physics-aware feature refiner, transforming raw, disjoint feature clusters into continuous manifolds with gradual property transitions. This restructured latent space enhances interpolative prediction accuracy while aligning learned representations with underlying physical laws. This study establishes ICL-FM as a generalizable, data-efficient paradigm for materials informatics.
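As a rough illustration of the bond-ionicity idea mentioned in the abstract, the sketch below computes Pauling's fractional ionic character from the electronegativity difference of a cation-anion pair. This is not the MagpieEX definition, which the abstract does not give. The function name `pauling_ionicity` and the small electronegativity table `PAULING_EN` are assumptions made for this example.

```python
import math

# Pauling electronegativities for a few elements (small illustrative subset).
PAULING_EN = {"Na": 0.93, "Mg": 1.31, "Si": 1.90, "Cl": 3.16, "O": 3.44}


def pauling_ionicity(cation: str, anion: str) -> float:
    """Fractional ionic character of a bond, 1 - exp(-(delta_chi)^2 / 4).

    Hypothetical helper for illustration; not taken from the paper.
    """
    delta = PAULING_EN[anion] - PAULING_EN[cation]
    return 1.0 - math.exp(-0.25 * delta * delta)


if __name__ == "__main__":
    for pair in [("Na", "Cl"), ("Mg", "O"), ("Si", "O")]:
        print(pair, round(pauling_ionicity(*pair), 3))
```

A descriptor of this kind is larger for strongly ionic pairs such as Na-Cl than for more covalent pairs such as Si-O, which is the sort of contrast a cation-anion feature is meant to expose.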

Preprint, 2022, arXiv

DeepXRD, a Deep Learning Model for Predicting of XRD spectrum from Materials Composition

One of the long-standing problems in materials science is how to predict a material's structure and then its properties given only its composition. Experimental characterization of crystal structures has been widely used for structure determination, which is however too expensive for high-throughput screening. At the same time, directly predicting crystal structures from compositions remains a challenging unsolved problem. Herein we propose a deep learning algorithm for predicting the XRD spectrum given only the composition of a material, which can then be used to infer key structural features for downstream structural analysis such as crystal system or space group classification or crystal lattice parameter determination or materials property predictions. Benchmark studies on two datasets show that our DeepXRD algorithm can achieve good performance for XRD prediction as evaluated over our test sets. It can thus be used in high-throughput screening in the huge materials composition space for new materials discovery.

Preprint, 2022, arXiv

Materials Transformers Language Models for Generative Materials Design: a benchmark study

Pre-trained transformer language models on large unlabeled corpus have produced state-of-the-art results in natural language processing, organic molecule design, and protein sequence generation. However, no such models have been applied to learn the composition patterns of inorganic materials. Here we train a series of seven modern transformer language models (GPT, GPT-2, GPT-Neo, GPT-J, BLMM, BART, and RoBERTa) using the expanded formulas from material deposited in the ICSD, OQMD, and Materials Projects databases. Six different datasets with/out non-charge-neutral or balanced electronegativity samples are used to benchmark the performances and uncover the generation biases of modern transformer models for the generative design of materials compositions. Our extensive experiments showed that the causal language models based materials transformers can generate chemically valid materials compositions with as high as 97.54% to be charge neutral and 91.40% to be electronegativity balanced, which has more than 6 times higher enrichment compared to a baseline pseudo-random sampling algorithm. These models also demonstrate high novelty and their potential in new materials discovery has been proved by their capability to recover the leave-out materials. We also find that the properties of the generated samples can be tailored by training the models with selected training sets such as high-bandgap materials. Our experiments also showed that different models each have their own preference in terms of the properties of the generated samples and their running time complexity varies a lot. We have applied our materials transformer models to discover a set of new materials as validated using DFT calculations.
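The charge-neutrality criterion reported in this abstract can be illustrated with a minimal sketch. It assumes a tiny hand-written oxidation-state table (`OXIDATION_STATES`) and a hypothetical helper `is_charge_neutral`. It assigns a single oxidation state per element, so mixed-valence compounds such as Fe3O4 are rejected, and it is not the validation code used in the paper.

```python
from itertools import product

# Hypothetical, heavily truncated table of common oxidation states (illustration only).
OXIDATION_STATES = {
    "Na": (1,),
    "Sr": (2,),
    "Fe": (2, 3),
    "Ti": (2, 3, 4),
    "O": (-2,),
    "Cl": (-1,),
}


def is_charge_neutral(composition: dict) -> bool:
    """Return True if one oxidation state per element can sum to zero charge.

    `composition` maps element symbols to integer atom counts, e.g. {"Fe": 2, "O": 3}.
    """
    elements = list(composition)
    choices = [OXIDATION_STATES[el] for el in elements]
    # Try every combination of one oxidation state per element.
    for states in product(*choices):
        total = sum(q * composition[el] for q, el in zip(states, elements))
        if total == 0:
            return True
    return False


if __name__ == "__main__":
    print(is_charge_neutral({"Sr": 1, "Ti": 1, "O": 3}))  # SrTiO3
    print(is_charge_neutral({"Na": 1, "O": 1}))           # NaO
```

Applied to a batch of generated formulas, the fraction for which such a check succeeds gives a validity rate of the kind quoted above, under whatever oxidation-state table is assumed.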
