Researcher profile

N. M. Anoop Krishnan

N. M. Anoop Krishnan contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
12 works
0 followers
10 topics
4 close collaborators

Published work

12 published items

preprint | 2026 | arXiv

Agentic AI Scientists Are Not Built For Autonomous Scientific Discovery

A growing body of work pursues AI scientists capable of end-to-end autonomous scientific discovery. This position paper argues that although they already function as co-scientists, agentic AI scientists are not built for autonomous scientific discovery. We identify the following challenges in building and deploying autonomous AI scientists: (1) Problem selection is influenced by the McNamara fallacy; (2) Agents are built on large language models (LLMs) whose training corpora omit tacit procedural and failure knowledge of laboratory practice; (3) Preference optimisation during post-training compresses output diversity toward consensus; and (4) Most scientific benchmarks measure single-turn prediction accuracy and lack feedback from physical experiments back to the computational model. These challenges are not just questions of scale and scaffolding; they require revisiting fundamental design choices. To build truly autonomous AI scientists, we recommend the use of scientific simulations as verifiers for training, the design of persistent world models that represent the shifting objectives governing real investigations, the establishment of a centralized preregistration repository for all AI-generated hypotheses, and application driven by scientific need rather than tool affordance.

preprint | 2026 | arXiv

AMGenC: Generating Charge Balanced Amorphous Materials

Amorphous (disordered) materials are solids that have shown great potential in various domains, including energy storage, thermal management, and advanced materials. Unlike crystalline materials that can be described by unit cells containing a few to hundreds of atoms, amorphous materials require larger simulation cells with at least hundreds to thousands of atoms. To advance the design of amorphous materials with desired properties and facilitate the exploration of their vast design space, generative inverse design has emerged as a promising approach. It aims to directly output materials with properties closely aligned with the desired ones using probabilistic generative models conditioned on desired properties, which can be more resource efficient than the traditional trial-and-error approach. However, due to the inherent stochasticity of probabilistic generative models, when element assignments are unconstrained, a large portion of generated materials may be charge unbalanced, and no existing methods can effectively mitigate this limitation. In this work, we propose AMGenC, a new generative inverse design method for amorphous materials that can guarantee the generation of charge balanced samples, with minimal additional computational overhead and without sacrificing inverse design accuracy. AMGenC achieves this through an element noise that gives the generation process a starting point centered around charge balance, and the combination of a per-step soft projection and a final discrete projection for steering the elements toward exact charge balance throughout the generation. We perform extensive experiments on two amorphous materials datasets. Experimental results provide evidence that AMGenC achieves its design goal.
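The charge-balance constraint mentioned above can be illustrated with a minimal sketch. It assumes each element carries one fixed oxidation state and only checks whether a composition is neutral; it is not AMGenC's noise or projection scheme, which the abstract does not specify. The function names and the example compositions are hypothetical.

```python
def net_charge(counts, oxidation_states):
    """Sum of (atom count x assumed oxidation state) over all elements."""
    return sum(n * oxidation_states[element] for element, n in counts.items())


def is_charge_balanced(counts, oxidation_states):
    """A composition is charge balanced when its net charge is exactly zero."""
    return net_charge(counts, oxidation_states) == 0


# Assumed fixed oxidation states (a simplification: real elements can take several).
OXIDATION_STATES = {"Si": 4, "Na": 1, "O": -2}

# Hypothetical simulation cells, given as atom counts per element.
silica_cell = {"Si": 100, "O": 200}                    # 400 - 400 = 0
unbalanced_cell = {"Na": 20, "Si": 100, "O": 205}      # 20 + 400 - 410 = +10

silica_ok = is_charge_balanced(silica_cell, OXIDATION_STATES)
unbalanced_charge = net_charge(unbalanced_cell, OXIDATION_STATES)
```

With unconstrained element assignment, a generated cell can end up like the second example, which is the failure mode the abstract describes.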

preprint | 2026 | arXiv

Beyond Manual Curation: Augmenting Targeted Protein Degradation Databases via Agentic Literature Extraction Workflows

Predictive models in biomedicine depend on structured assay data locked in the text, tables, and supplements of primary publications. This bottleneck is especially acute in targeted protein degradation (TPD), where each assay record must combine compound identity, degradation target, recruiter, assay context, and endpoint values reported across sections, tables, and supplementary files. Inconsistent compound identifiers and incomplete or implicit assay context further demand domain-specific logic that generic LLM pipelines do not provide. Existing molecular glue and PROTAC databases are manually curated and often lack the experimental context required for downstream modeling. We formulate TPD database extraction as a domain-specific curation task and present an expert-in-the-loop LLM workflow, evaluated through a triangular comparison among LLM predictions, standardized baseline records, and expert-annotated ground truth. A lightweight cross-validated prompt-refinement module adapts extraction instructions from scarce expert annotations. With only seven annotated molecular glue publications, the workflow achieved record-level $F_1 = 0.98$ and transferred to PROTACs by terminology substitution alone, maintaining record-level $F_1 > 0.93$. Applied at scale, it expanded molecular glue and PROTAC databases by 81% and 92% in record count, respectively, with 92% and 82.5% of newly recovered records validated as correct upon expert review. The workflow also recovered kinetic and assay-context information essential for cross-study potency comparison and condition-aware degradation modeling. We release the workflow, prompts, evaluation code, and extracted datasets as resources for TPD data curation and AI-assisted scientific curation more broadly.
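A record-level $F_1$ score of the kind quoted above can be sketched as follows. This assumes each assay record is reduced to a tuple of fields and that a predicted record counts as correct only when it matches an expert-annotated record exactly. The matching rule, the field layout, and the toy records are hypothetical; the abstract does not describe how the authors match records.

```python
def record_f1(predicted, gold):
    """F1 over records, counting a prediction as correct only on exact match."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Toy records: (compound id, target, recruiter, endpoint value). All placeholders.
gold = [
    ("cmpd-1", "target-A", "recruiter-X", 12.0),
    ("cmpd-2", "target-B", "recruiter-X", 3.5),
    ("cmpd-3", "target-C", "recruiter-Y", 40.0),
]
# The third prediction has a wrong endpoint value, so it does not match.
predicted = [gold[0], gold[1], ("cmpd-3", "target-C", "recruiter-Y", 4.0)]

score = record_f1(predicted, gold)  # precision = recall = 2/3 in this toy case
```

Exact matching is the strictest choice; a tolerance on numeric endpoints or per-field scoring would give a more forgiving metric.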

preprint | 2026 | arXiv

LeMat-GenBench: A Unified Evaluation Framework for Crystal Generative Models

Generative machine learning (ML) models hold great promise for accelerating materials discovery through the inverse design of inorganic crystals, enabling an unprecedented exploration of chemical space. Yet, the lack of standardized evaluation frameworks makes it challenging to evaluate, compare, and further develop these ML models meaningfully. In this work, we introduce LeMat-GenBench, a unified benchmark for generative models of crystalline materials, supported by a set of evaluation metrics designed to better inform model development and downstream applications. We release both an open-source evaluation suite and a public leaderboard on Hugging Face, and benchmark 12 recent generative models. Results reveal that an increase in stability leads to a decrease in novelty and diversity on average, with no model excelling across all dimensions. Altogether, LeMat-GenBench establishes a reproducible and extensible foundation for fair model comparison and aims to guide the development of more reliable, discovery-oriented generative models for crystalline materials.

preprint | 2026 | arXiv

MDGYM: Benchmarking AI Agents on Molecular Simulations

The promise of AI-driven scientific discovery hinges on whether AI agents can autonomously design and execute the computational workflows that underpin modern science. Molecular dynamics (MD) simulation presents a natural test bed to stress-test this claim; it requires translating physical intuition into syntactically and semantically correct input scripts, reasoning about initial and boundary conditions, diagnosing numerically unstable trajectories, and interpreting outputs against known physical behavior and laws. We introduce MDGYM, a benchmark of 169 expert-curated MD simulations spanning LAMMPS and GROMACS, two widely used MD packages, across three increasing difficulty levels. We evaluate three agentic frameworks (Claude Code, Codex, and OpenHands) with four LLMs, and find that all perform poorly: even the strongest agent solves only 21% of easy-level tasks, with less than 10% at higher difficulties. Trajectory analysis reveals a characteristic pattern of failure: agents successfully invoke simulation machinery but produce physically unstable configurations, fabricate numerical outputs without executing the underlying computation, or abandon tasks prematurely rather than iterating through simulation-specific errors. These failure modes are qualitatively distinct from those observed in general software engineering benchmarks, indicating that fluent code generation does not transfer to grounded physical reasoning.

preprint | 2023 | arXiv

Glass Hardness: Predicting Composition and Load Effects via Symbolic Reasoning-Informed Machine Learning

Glass hardness varies in a non-linear fashion with the chemical composition and applied load, a phenomenon known as the indentation size effect (ISE), which is challenging to predict quantitatively. Here, using a curated dataset of approximately 3000 inorganic glasses from the literature comprising the composition, indentation load, and hardness, we develop machine learning (ML) models to predict the composition and load dependence of Vickers hardness. Interestingly, when tested on new glass compositions unseen during training, the standard data-driven ML model failed to capture the ISE. To address this gap, we combined an empirical expression (Bernhardt law) to describe the ISE with ML to develop a framework that incorporates the symbolic law representing the domain reasoning in ML, namely Symbolic Reasoning-Informed ML Procedure (SRIMP). We show that the resulting SRIMP outperforms the data-driven ML model in predicting the ISE. Finally, we interpret the SRIMP model to understand the contribution of the glass network formers and modifiers toward composition and load-dependent (ISE) and load-independent hardness. The deconvolution of the hardness into load-dependent and load-independent terms paves the way toward a holistic understanding of composition and ISE in glasses, enabling the accelerated discovery of new glass compositions with targeted hardness.
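The split of hardness into load-dependent and load-independent terms can be made concrete with a short sketch. It assumes the Bernhardt law in its commonly quoted form, P = a1*d + a2*d^2 (P is the load, d the indent diagonal), together with the standard Vickers relation H = 1.8544*P/d^2, which gives H = 1.8544*(a1/d + a2). The abstract does not state the exact parametrization used in SRIMP, so the form, the function names, and the coefficient values below are assumptions for illustration only.

```python
import math

# Geometric factor 2*sin(68 deg) for the 136 degree Vickers pyramid.
VICKERS_CONSTANT = 1.8544


def indent_diagonal(load, a1, a2):
    """Positive root d of a2*d**2 + a1*d - load = 0 (assumed Bernhardt form)."""
    return (-a1 + math.sqrt(a1 * a1 + 4.0 * a2 * load)) / (2.0 * a2)


def hardness_terms(load, a1, a2):
    """Return (load-dependent, load-independent) contributions to Vickers hardness."""
    d = indent_diagonal(load, a1, a2)
    load_dependent = VICKERS_CONSTANT * a1 / d   # shrinks as the indent grows
    load_independent = VICKERS_CONSTANT * a2     # constant for a given glass
    return load_dependent, load_independent


# Hypothetical coefficients in arbitrary consistent units.
A1, A2 = 0.5, 500.0
low_load = hardness_terms(1.0, A1, A2)
high_load = hardness_terms(10.0, A1, A2)
```

For positive a1 the first term decays with increasing load, which reproduces the indentation size effect, while the second term is what remains at large loads.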

preprint | 2022 | arXiv

Learning the Stress-Strain Fields in Digital Composites using Fourier Neural Operator

Increased demands for high-performance materials have led to advanced composite materials with complex hierarchical designs. However, designing a tailored material microstructure with targeted properties and performance is extremely challenging due to the innumerable design combinations and prohibitive computational costs for physics-based solvers. In this study, we employ a neural operator-based framework, namely the Fourier neural operator (FNO), to learn the mechanical response of 2D composites. We show that the FNO exhibits high-fidelity predictions of the complete stress and strain tensor fields for geometrically complex composite microstructures with very little training data and purely based on the microstructure. The model also exhibits zero-shot generalization on unseen arbitrary geometries with high accuracy. Furthermore, the model exhibits zero-shot super-resolution capabilities by predicting high-resolution stress and strain fields directly from low-resolution input configurations. Finally, the model also provides high-accuracy predictions of equivalent measures for stress-strain fields allowing realistic upscaling of the results.

preprint | 2022 | arXiv

Unsupervised Graph Neural Network Reveals the Structure–Dynamics Correlation in Disordered Systems

Learning the structure–dynamics correlation in disordered systems is a long-standing problem. Here, we use unsupervised machine learning employing graph neural networks (GNN) to investigate the local structures in disordered systems. We test our approach on 2D binary A65B35 Lennard-Jones (LJ) glasses and extract structures corresponding to liquid, supercooled and glassy states at different cooling rates. The neighborhood representation of atoms learned by a GNN in an unsupervised fashion, when clustered, reveals local structures with varying potential energies. These clusters exhibit dynamical heterogeneity in the structure in congruence with their local energy landscape. Altogether, the present study shows that unsupervised graph embedding can reveal the structure–dynamics correlation in disordered structures.

preprint | 2021 | arXiv

Looking Through Glass: Knowledge Discovery from Materials Science Literature using Natural Language Processing

Most of the knowledge in materials science literature is in the form of unstructured data such as text and images. Here, we present a framework employing natural language processing, which automates text and image comprehension and precision knowledge extraction from inorganic glasses' literature. The abstracts are automatically categorized using latent Dirichlet allocation (LDA), providing a way to classify and search semantically linked publications. Similarly, a comprehensive summary of images and plots is presented using the 'Caption Cluster Plot' (CCP), which provides direct access to the images buried in the papers. Finally, we combine the LDA and CCP with the chemical elements occurring in the manuscript to present an 'Elemental map', a topical and image-wise distribution of chemical elements in the literature. Overall, the framework presented here can be a generic and powerful tool to extract and disseminate material-specific information on composition-structure-processing-property dataspaces, allowing insights into fundamental problems relevant to the materials science community and accelerated materials discovery.

preprint | 2021 | arXiv

Unveiling the Glass Veil: Elucidating the Optical Properties in Glasses with Interpretable Machine Learning

Due to their excellent optical properties, glasses are used for various applications ranging from smartphone screens to telescopes. Developing compositions with tailored Abbe number (Vd) and refractive index (nd), two crucial optical properties, is a major challenge. To this end, machine learning (ML) approaches have been successfully used to develop composition-property models. However, these models are essentially black-box in nature and suffer from a lack of interpretability. In this paper, we demonstrate the use of ML models to predict the composition-dependent variations of Vd and the refractive index at 587.6 nm (nd). Further, using SHapley Additive exPlanations (SHAP), we interpret the ML models to identify the contribution of each of the input components toward a target prediction. We observe that the glass formers such as SiO2, B2O3, and P2O5, and intermediates like TiO2, PbO, and Bi2O3 play a significant role in controlling the optical properties. Interestingly, components that contribute toward increasing the nd are found to decrease the Vd and vice-versa. Finally, we develop the Abbe diagram, also known as the "glass veil", using the ML models, allowing accelerated discovery of new glasses for optical properties beyond the experimental Pareto front. Overall, employing explainable ML, we discover the hidden compositional control on the optical properties of oxide glasses.
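For readers unfamiliar with the two target properties, the Abbe number follows from the refractive index at three standard wavelengths: Vd = (nd - 1) / (nF - nC), with nd at 587.6 nm, nF at 486.1 nm and nC at 656.3 nm. The sketch below only evaluates this definition; it is not the ML model from the paper, and the index values are illustrative numbers roughly typical of a borosilicate crown glass, not data from the study.

```python
def abbe_number(n_d, n_f, n_c):
    """Abbe number Vd from refractive indices at the d, F and C spectral lines."""
    dispersion = n_f - n_c  # principal dispersion; positive for normal dispersion
    if dispersion <= 0:
        raise ValueError("expected n_f > n_c for a normally dispersive glass")
    return (n_d - 1.0) / dispersion


# Illustrative indices (assumed values, not taken from the paper).
crown_like = abbe_number(n_d=1.5168, n_f=1.52238, n_c=1.51432)

# Raising the dispersion at a fixed nd lowers Vd.
more_dispersive = abbe_number(n_d=1.5168, n_f=1.52638, n_c=1.51432)
```

Because higher-index glasses usually also have higher dispersion, nd and Vd tend to move in opposite directions, which is consistent with the trade-off the abstract reports.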

preprint | 2020 | arXiv

On the Allowable or Forbidden Nature of Vapor-Deposited Glasses

Vapor deposition can yield glasses that are more stable than those obtained by the traditional melt-quenching route. However, it remains unclear whether vapor-deposited glasses are "allowable" or "forbidden," that is, whether they are equivalent to glasses formed by cooling a liquid extremely slowly or whether they differ in nature from melt-quenched glasses. Here, based on reactive molecular dynamics (MD) simulations of silica glasses, we demonstrate that the allowable or forbidden nature of vapor-deposited glasses depends on the temperature of the substrate and, in turn, is found to be encoded in their medium-range order structure.

preprint | 2020 | arXiv

Scalable Gaussian Processes for Predicting the Properties of Inorganic Glasses with Large Datasets

Gaussian process regression (GPR) is a useful technique to predict composition–property relationships in glasses as the method inherently provides the standard deviation of the predictions. However, the technique remains restricted to small datasets due to the substantial computational cost associated with it. Here, using a scalable GPR algorithm, namely, kernel interpolation for scalable structured Gaussian processes (KISS-GP) along with massively scalable GP (MSGP), we develop composition–property models for inorganic glasses based on a large dataset with more than 100,000 glass compositions, 37 components, and nine important properties, namely, density, Young's, shear, and bulk moduli, thermal expansion coefficient, Vickers hardness, refractive index, glass transition temperature, and liquidus temperature. Finally, to accelerate glass design, the models developed here are shared publicly as part of a package, namely, Python for Glass Genomics (PyGGi).