Researcher profile

Nitesh V. Chawla

Nitesh V. Chawla contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
16works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

16 published item(s)

preprint2026arXiv

AutoLLMResearch: Training Research Agents for Automating LLM Experiment Configuration - Learning from Cheap, Optimizing Expensive

Effectively configuring scalable large language model (LLM) experiments, spanning architecture design, hyperparameter tuning, and beyond, is crucial for advancing LLM research, as poor configuration choices can waste substantial computational resources and prevent models from realizing their full potential. Prior automated methods are designed for low-cost settings where repeated trial and error is feasible, but scalable LLM experiments are too expensive for such extensive iteration. To our knowledge, no work has addressed the automation of high-cost LLM experiment configurations, leaving this problem labor-intensive and dependent on expert intuition. Motivated by this gap, we propose AutoLLMResearch, an agentic framework that mimics how human researchers learn generalizable principles from low-fidelity experiments and extrapolate to efficiently identify promising configurations in expensive LLM settings. The core challenge is how to enable an agent to learn, through interaction with a multi-fidelity experimental environment that captures the structure of the LLM configuration landscape. To achieve this, we propose a systematic framework with two key components: 1) LLMConfig-Gym, a multi-fidelity environment encompassing four critical LLM experiment tasks, supported by over one million GPU hours of verifiable experiment outcomes; 2) A structured training pipeline that formulates configuration research as a long-horizon Markov Decision Process and accordingly incentivizes cross-fidelity extrapolation reasoning. Extensive evaluation against diverse strong baselines on held-out experiments demonstrates the effectiveness, generalization, and interpretability of our framework, supporting its potential as a practical and general solution for scalable real-world LLM experiment automation.

preprint2026arXiv

Do LLMs have core beliefs?

The rise of Large Language Models (LLMs) has sparked debate about whether these systems exhibit human-level cognition. In this debate, little attention has been paid to a structural component of human cognition: core beliefs, truths that provide a foundation around which we can build a worldview. These commitments usually resist debunking, as abandoning them would represent a fundamental shift in how we see reality. In this paper, we ask whether LLMs hold anything akin to core commitments. Using a probing framework we call Adversarial Dialogue Trees (ADTs) over five domains (science, history, geography, biology, and mathematics), we find that most LLMs fail to maintain a stable worldview. Though some recent models showed improved stability, they still eventually failed to maintain key commitments under conversational pressure. These results document an improvement in argumentative skills across model generations but indicate that all current models lack a key component of human-level cognition.

preprint2026arXiv

Geometry of Knowledge Allows Extending Diversity Boundaries of Large Language Models

Starting from the hypothesis that knowledge in semantic space is organized along structured manifolds, we argue that this geometric structure renders the space explorable. By traversing it and using the resulting continuous representations to condition an LLM's generation distribution, we can systematically expand the model's reachable semantic range. We introduce a framework that requires no modification of LLM parameters and operationalizes this idea by constructing a conditioning distribution from a small set of diverse anchor generations. This distribution conditions LLM's generation via an xRAG-style projector. Our experiments demonstrate that this manifold-based conditioning substantially increases generative diversity, with direct benefits for enhancing divergent thinking, a core facet of creativity, in language models.

preprint2023arXiv

Graph Neural Prompting with Large Language Models

Large language models (LLMs) have shown remarkable generalization capability with exceptional performance in various language modeling tasks. However, they still exhibit inherent limitations in precisely capturing and returning grounded knowledge. While existing work has explored utilizing knowledge graphs (KGs) to enhance language modeling via joint training and customized model architectures, applying this to LLMs is problematic owing to their large number of parameters and high computational cost. Therefore, how to enhance pre-trained LLMs using grounded knowledge, e.g., retrieval-augmented generation, remains an open question. In this work, we propose Graph Neural Prompting (GNP), a novel plug-and-play method to assist pre-trained LLMs in learning beneficial knowledge from KGs. GNP encompasses various designs, including a standard graph neural network encoder, a cross-modality pooling module, a domain projector, and a self-supervised link prediction objective. Extensive experiments on multiple datasets demonstrate the superiority of GNP on both commonsense and biomedical reasoning tasks across different LLM sizes and settings. Code is available at https://github.com/meettyj/GNP.

preprint2022arXiv

Few-Shot Learning on Graphs

Graph representation learning has attracted tremendous attention due to its remarkable performance in many real-world applications. However, prevailing supervised graph representation learning models for specific tasks often suffer from label sparsity issue as data labeling is always time and resource consuming. In light of this, few-shot learning on graphs (FSLG), which combines the strengths of graph representation learning and few-shot learning together, has been proposed to tackle the performance degradation in face of limited annotated data challenge. There have been many studies working on FSLG recently. In this paper, we comprehensively survey these work in the form of a series of methods and applications. Specifically, we first introduce FSLG challenges and bases, then categorize and summarize existing work of FSLG in terms of three major graph mining tasks at different granularity levels, i.e., node, edge, and graph. Finally, we share our thoughts on some future research directions of FSLG. The authors of this survey have contributed significantly to the AI literature on FSLG over the last few years.

preprint2022arXiv

Recipe2Vec: Multi-modal Recipe Representation Learning with Graph Neural Networks

Learning effective recipe representations is essential in food studies. Unlike what has been developed for image-based recipe retrieval or learning structural text embeddings, the combined effect of multi-modal information (i.e., recipe images, text, and relation data) receives less attention. In this paper, we formalize the problem of multi-modal recipe representation learning to integrate the visual, textual, and relational information into recipe embeddings. In particular, we first present Large-RG, a new recipe graph data with over half a million nodes, making it the largest recipe graph to date. We then propose Recipe2Vec, a novel graph neural network based recipe embedding model to capture multi-modal information. Additionally, we introduce an adversarial attack strategy to ensure stable learning and improve performance. Finally, we design a joint objective function of node classification and adversarial learning to optimize the model. Extensive experiments demonstrate that Recipe2Vec outperforms state-of-the-art baselines on two classic food study tasks, i.e., cuisine category classification and region prediction. Dataset and codes are available at https://github.com/meettyj/Recipe2Vec.

preprint2022arXiv

RecipeRec: A Heterogeneous Graph Learning Model for Recipe Recommendation

Recipe recommendation systems play an essential role in helping people decide what to eat. Existing recipe recommendation systems typically focused on content-based or collaborative filtering approaches, ignoring the higher-order collaborative signal such as relational structure information among users, recipes and food items. In this paper, we formalize the problem of recipe recommendation with graphs to incorporate the collaborative signal into recipe recommendation through graph modeling. In particular, we first present URI-Graph, a new and large-scale user-recipe-ingredient graph. We then propose RecipeRec, a novel heterogeneous graph learning model for recipe recommendation. The proposed model can capture recipe content and collaborative signal through a heterogeneous graph neural network with hierarchical attention and an ingredient set transformer. We also introduce a graph contrastive augmentation strategy to extract informative graph knowledge in a self-supervised manner. Finally, we design a joint objective function of recommendation and contrastive learning to optimize the model. Extensive experiments demonstrate that RecipeRec outperforms state-of-the-art methods for recipe recommendation. Dataset and codes are available at https://github.com/meettyj/RecipeRec.

preprint2022arXiv

SD2: Slicing and Dicing Scholarly Data for Interactive Evaluation of Academic Performance

Comprehensively evaluating and comparing researchers' academic performance is complicated due to the intrinsic complexity of scholarly data. Different scholarly evaluation tasks often require the publication and citation data to be investigated in various manners. In this paper, we present an interactive visualization framework, SD2, to enable flexible data partition and composition to support various analysis requirements within a single system. SD2 features the hierarchical histogram, a novel visual representation for flexibly slicing and dicing the data, allowing different aspects of scholarly performance to be studied and compared. We also leverage the state-of-the-art set visualization technique to select individual researchers or combine multiple scholars for comprehensive visual comparison. We conduct multiple rounds of expert evaluation to study the effectiveness and usability of SD2 and revise the design and system implementation accordingly. The effectiveness of SD2 is demonstrated via multiple usage scenarios with each aiming to answer a specific, commonly raised question.

preprint2021arXiv

Few-Shot Graph Learning for Molecular Property Prediction

The recent success of graph neural networks has significantly boosted molecular property prediction, advancing activities such as drug discovery. The existing deep neural network methods usually require large training dataset for each property, impairing their performances in cases (especially for new molecular properties) with a limited amount of experimental data, which are common in real situations. To this end, we propose Meta-MGNN, a novel model for few-shot molecular property prediction. Meta-MGNN applies molecular graph neural network to learn molecular representation and builds a meta-learning framework for model optimization. To exploit unlabeled molecular information and address task heterogeneity of different molecular properties, Meta-MGNN further incorporates molecular structure, attribute based self-supervised modules and self-attentive task weights into the former framework, strengthening the whole learning model. Extensive experiments on two public multi-property datasets demonstrate that Meta-MGNN outperforms a variety of state-of-the-art methods.

preprint2020arXiv

AttrE2vec: Unsupervised Attributed Edge Representation Learning

Representation learning has overcome the often arduous and manual featurization of networks through (unsupervised) feature learning as it results in embeddings that can apply to a variety of downstream learning tasks. The focus of representation learning on graphs has focused mainly on shallow (node-centric) or deep (graph-based) learning approaches. While there have been approaches that work on homogeneous and heterogeneous networks with multi-typed nodes and edges, there is a gap in learning edge representations. This paper proposes a novel unsupervised inductive method called AttrE2Vec, which learns a low-dimensional vector representation for edges in attributed networks. It systematically captures the topological proximity, attributes affinity, and feature similarity of edges. Contrary to current advances in edge embedding research, our proposal extends the body of methods providing representations for edges, capturing graph attributes in an inductive and unsupervised manner. Experimental results show that, compared to contemporary approaches, our method builds more powerful edge vector representations, reflected by higher quality measures (AUC, accuracy) in downstream tasks as edge classification and edge clustering. It is also confirmed by analyzing low-dimensional embedding projections.

preprint2020arXiv

Calendar Graph Neural Networks for Modeling Time Structures in Spatiotemporal User Behaviors

User behavior modeling is important for industrial applications such as demographic attribute prediction, content recommendation, and target advertising. Existing methods represent behavior log as a sequence of adopted items and find sequential patterns; however, concrete location and time information in the behavior log, reflecting dynamic and periodic patterns, joint with the spatial dimension, can be useful for modeling users and predicting their characteristics. In this work, we propose a novel model based on graph neural networks for learning user representations from spatiotemporal behavior data. A behavior log comprises a sequence of sessions; and a session has a location, start time, end time, and a sequence of adopted items. Our model's architecture incorporates two networked structures. One is a tripartite network of items, sessions, and locations. The other is a hierarchical calendar network of hour, week, and weekday nodes. It first aggregates embeddings of location and items into session embeddings via the tripartite network, and then generates user embeddings from the session embeddings via the calendar structure. The user embeddings preserve spatial patterns and temporal patterns of a variety of periodicity (e.g., hourly, weekly, and weekday patterns). It adopts the attention mechanism to model complex interactions among the multiple patterns in user behaviors. Experiments on real datasets (i.e., clicks on news articles in a mobile app) show our approach outperforms strong baselines for predicting missing demographic attributes.

preprint2020arXiv

Canonicalizing Open Knowledge Bases with Multi-Layered Meta-Graph Neural Network

Noun phrases and relational phrases in Open Knowledge Bases are often not canonical, leading to redundant and ambiguous facts. In this work, we integrate structural information (from which tuple, which sentence) and semantic information (semantic similarity) to do the canonicalization. We represent the two types of information as a multi-layered graph: the structural information forms the links across the sentence, relational phrase, and noun phrase layers; the semantic information forms weighted intra-layer links for each layer. We propose a graph neural network model to aggregate the representations of noun phrases and relational phrases through the multi-layered meta-graph structure. Experiments show that our model outperforms existing approaches on a public datasets in general domain.

preprint2020arXiv

Graph Few-shot Learning via Knowledge Transfer

Towards the challenging problem of semi-supervised node classification, there have been extensive studies. As a frontier, Graph Neural Networks (GNNs) have aroused great interest recently, which update the representation of each node by aggregating information of its neighbors. However, most GNNs have shallow layers with a limited receptive field and may not achieve satisfactory performance especially when the number of labeled nodes is quite small. To address this challenge, we innovatively propose a graph few-shot learning (GFL) algorithm that incorporates prior knowledge learned from auxiliary graphs to improve classification accuracy on the target graph. Specifically, a transferable metric space characterized by a node embedding and a graph-specific prototype embedding function is shared between auxiliary graphs and the target, facilitating the transfer of structural knowledge. Extensive experiments and ablation studies on four real-world graph datasets demonstrate the effectiveness of our proposed model.

preprint2020arXiv

Jointly Predicting Job Performance, Personality, Cognitive Ability, Affect, and Well-Being

Assessment of job performance, personalized health and psychometric measures are domains where data-driven and ubiquitous computing exhibits the potential of a profound impact in the future. Existing techniques use data extracted from questionnaires, sensors (wearable, computer, etc.), or other traits, to assess well-being and cognitive attributes of individuals. However, these techniques can neither predict individual's well-being and psychological traits in a global manner nor consider the challenges associated to processing the data available, that is incomplete and noisy. In this paper, we create a benchmark for predictive analysis of individuals from a perspective that integrates: physical and physiological behavior, psychological states and traits, and job performance. We design data mining techniques as benchmark and uses real noisy and incomplete data derived from wearable sensors to predict 19 constructs based on 12 standardized well-validated tests. The study included 757 participants who were knowledge workers in organizations across the USA with varied work roles. We developed a data mining framework to extract the meaningful predictors for each of the 19 variables under consideration. Our model is the first benchmark that combines these various instrument-derived variables in a single framework to understand people's behavior by leveraging real uncurated data from wearable, mobile, and social media sources. We verify our approach experimentally using the data obtained from our longitudinal study. The results show that our framework is consistently reliable and capable of predicting the variables under study better than the baselines when prediction is restricted to the noisy, incomplete data.

preprint2020arXiv

Learning Attribute-Structure Co-Evolutions in Dynamic Graphs

Most graph neural network models learn embeddings of nodes in static attributed graphs for predictive analysis. Recent attempts have been made to learn temporal proximity of the nodes. We find that real dynamic attributed graphs exhibit complex co-evolution of node attributes and graph structure. Learning node embeddings for forecasting change of node attributes and birth and death of links over time remains an open problem. In this work, we present a novel framework called CoEvoGNN for modeling dynamic attributed graph sequence. It preserves the impact of earlier graphs on the current graph by embedding generation through the sequence. It has a temporal self-attention mechanism to model long-range dependencies in the evolution. Moreover, CoEvoGNN optimizes model parameters jointly on two dynamic tasks, attribute inference and link prediction over time. So the model can capture the co-evolutionary patterns of attribute change and link formation. This framework can adapt to any graph neural algorithms so we implemented and investigated three methods based on it: CoEvoGCN, CoEvoGAT, and CoEvoSAGE. Experiments demonstrate the framework (and its methods) outperform strong baselines on predicting an entire unseen graph snapshot of personal attributes and interpersonal links in dynamic social graphs and financial graphs.

preprint2020arXiv

Pandemic Pulse: Unraveling and Modeling Social Signals during the COVID-19 Pandemic

We present and begin to explore a collection of social data that represents part of the COVID-19 pandemic's effects on the United States. This data is collected from a range of sources and includes longitudinal trends of news topics, social distancing behaviors, community mobility changes, web searches, and more. This multimodal effort enables new opportunities for analyzing the impacts such a pandemic has on the pulse of society. Our preliminary results show that the number of COVID-19-related news articles published immediately after the World Health Organization declared the pandemic on March 11, and that since that time have steadily decreased---regardless of changes in the number of cases or public policies. Additionally, we found that politically moderate and scientifically-grounded sources have, relative to baselines measured before the beginning of the pandemic, published a lower proportion of COVID-19 news than more politically extreme sources. We suggest that further analysis of these multimodal signals could produce meaningful social insights and present an interactive dashboard to aid further exploration.