Researcher profile

Chintan Shah

Chintan Shah contributes to research discovery and scholarly infrastructure.

Researcher; Affiliation not imported; Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
8 topics
4 close collaborators

Published work

5 published items

Preprint, 2025, arXiv

State-of-the-art Small Language Coder Model: Mify-Coder

We present Mify-Coder, a 2.5B-parameter code model trained on 4.2T tokens using a compute-optimal strategy built on the Mify-2.5B foundation model. Mify-Coder achieves comparable accuracy and safety while significantly outperforming much larger baseline models on standard coding and function-calling benchmarks, demonstrating that compact models can match frontier-grade models in code generation and agent-driven workflows. Our training pipeline combines high-quality curated sources with synthetic data generated through agentically designed prompts, refined iteratively using enterprise-grade evaluation datasets. LLM-based quality filtering further enhances data density, enabling frugal yet effective training. Through disciplined exploration of CPT-SFT objectives, data mixtures, and sampling dynamics, we deliver frontier-grade code intelligence within a single continuous training trajectory. Empirical evidence shows that principled data and compute discipline allow smaller models to achieve competitive accuracy, efficiency, and safety compliance. Quantized variants of Mify-Coder enable deployment on standard desktop environments without requiring specialized hardware.

Preprint, 2022, arXiv

Uncertainties in Atomic Data for Modeling Astrophysical Charge Exchange Plasmas

Relevant uncertainties on theoretical atomic data are vital for determining the accuracy of plasma diagnostics in a number of areas, in particular astrophysics. We present a new calculation of the uncertainties on the present theoretical ion-impact charge exchange atomic data and X-ray spectra, based on a set of comparisons with existing laboratory data obtained in historical merged-beam, cold-target recoil-ion momentum spectroscopy, and electron beam ion trap experiments. The average systematic uncertainties are found to be 35-88% on the total cross sections, and 57-75% on the characteristic line ratios. The model deviation increases as the collision energy decreases. The errors on total cross sections further induce a significant uncertainty in the calculation of ionization balance for low-temperature collisional plasmas. Substantial improvements of the atomic database and dedicated laboratory measurements are needed to get the current models ready for the X-ray spectra from the next X-ray spectroscopic mission.

Preprint, 2022, arXiv

X-ray spectra of the Fe-L complex III: systematic uncertainties in the atomic data

There has been a growing demand from the X-ray astronomy community for a quantitative estimate of systematic uncertainties originating from the atomic data used in plasma codes. Though several studies have looked into atomic data uncertainties using theoretical calculations, in general there is no commonly accepted solution for this task. We present a new approach for estimating uncertainties in the line emissivities for the current models of collisional plasma, mainly based upon dedicated analysis of observed high-resolution spectra of stellar coronae and galaxy clusters. We find that the systematic uncertainties of the observed lines consistently show anti-correlation with the model line fluxes, after properly accounting for the additional uncertainties from the ion concentration calculation. The strong lines in the spectra are in general better reproduced, indicating that the atomic data and modeling of the main transitions are more accurate than those for the minor ones. This underlying anti-correlation is found to be roughly independent of source properties, line positions, ion species, and the line formation processes. We further apply our method to the simulated XRISM and Athena observations of collisional plasma sources and discuss the impact of uncertainties on the interpretation of these spectra. The typical uncertainties are 1-2% on temperature and 3-20% on abundances of O, Ne, Fe, Mg, and Ni.

Preprint, 2020, arXiv

Finding Patient Zero: Learning Contagion Source with Graph Neural Networks

Locating the source of an epidemic, or patient zero (P0), can provide critical insights into the infection's transmission course and allow efficient resource allocation. Existing methods use graph-theoretic centrality measures and expensive message-passing algorithms, requiring knowledge of the underlying dynamics and its parameters. In this paper, we revisit this problem using graph neural networks (GNNs) to learn P0. We establish a theoretical limit for the identification of P0 in a class of epidemic models. We evaluate our method against different epidemic models on both synthetic networks and a real-world contact network, considering a disease with the history and characteristics of COVID-19. We observe that GNNs can identify P0 close to the theoretical bound on accuracy, without explicit input of dynamics or its parameters. In addition, the GNN is over 100 times faster than classic methods for inference on arbitrary graph topologies. Our theoretical bound also shows that the epidemic is like a ticking clock, emphasizing the importance of early contact-tracing. We find a maximum time after which accurate recovery of the source becomes impossible, regardless of the algorithm used.

Preprint, 2020, arXiv

X-ray spectra of the Fe-L complex II: atomic data constraints from EBIT experiment and X-ray grating observations of Capella

The Hitomi results for the Perseus cluster have shown that accurate atomic models are essential to the success of X-ray spectroscopic missions, and just as important as knowledge of instrumental calibration and astrophysical modeling. Preparing the models requires a multifaceted approach, including theoretical calculations, laboratory measurements, and calibration using real observations. In a previous paper, we presented a calculation of the electron impact cross sections on the transitions forming the Fe-L complex. In the present work, we systematically test the calculation against cross sections of ions measured in an electron beam ion trap experiment. A two-dimensional analysis in the electron beam energies and X-ray photon energies is utilized to disentangle radiative channels following dielectronic recombination, direct electron-impact excitation, and resonant excitation processes in the experimental data. The data calibrated through laboratory measurements are further fed into global modeling of the Chandra grating spectrum of Capella. We investigate and compare the fit quality, as well as the sensitivity of the derived physical parameters to the underlying atomic data and the astrophysical plasma modeling. We further list the potential areas of disagreement between the observation and the present calculations, which in turn call for renewed efforts in theoretical calculations and targeted laboratory measurements.