Source author record

Sachin Goyal

Sachin Goyal appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Machine Learning; Biological Physics; physics.comp-ph; Biomolecules; Computation and Language; cond-mat.soft; math-ph; math.DS; math.MP; nlin.PS

Catalog footprint

What is connected

7 works

10 topics

4 close collaborators


preprint, 2026, arXiv

Sharpness-Aware Pretraining Mitigates Catastrophic Forgetting

Pretraining optimizers are tuned to produce the strongest possible base model, on the assumption that a stronger starting point yields a stronger model after subsequent changes like post-training and quantization. This overlooks the geometry of the base model which controls how much of the base model's capabilities survive subsequent parameter updates. We study three pretraining optimization approaches that bias optimization toward flatter minima: Sharpness-Aware Minimization (SAM), large learning rates, and shortened learning rate annealing periods. Across model sizes ranging from 20M to 150M parameters, we find that these interventions consistently improve downstream performance after post-training on five common datasets with up to 80% less forgetting. These principles hold at scale: a short SAM mid-training phase applied to an existing OLMo-2-1B checkpoint reduces forgetting by 31% after MetaMath post-training and by 40% after 4-bit quantization.
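The Sharpness-Aware Minimization update named in the abstract can be illustrated with a minimal sketch. This is the generic two-step SAM rule applied to a toy quadratic loss, not the authors' pretraining setup; the function names `sam_step`, `loss`, and `grad`, the matrix `A`, and the values of `lr` and `rho` are all assumptions chosen for illustration.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.05, rho=0.05):
    """One SAM update (illustrative sketch; lr and rho are assumed values)."""
    g = grad_fn(w)
    # Ascent step: perturb the weights toward the approximate worst-case
    # point inside an L2 ball of radius rho.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descent step: evaluate the gradient at the perturbed weights and
    # apply it to the original weights.
    g_perturbed = grad_fn(w + eps)
    return w - lr * g_perturbed

# Toy quadratic loss 0.5 * w^T A w with one sharp and one flat direction.
A = np.diag([10.0, 0.1])

def loss(w):
    return 0.5 * w @ A @ w

def grad(w):
    return A @ w

w0 = np.array([1.0, 1.0])
w = w0.copy()
for _ in range(100):
    w = sam_step(w, grad)
```

Each update costs two gradient evaluations, one at the current weights and one at the perturbed weights, which is the usual price of SAM relative to plain gradient descent.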

preprint, 2022, arXiv

MET: Masked Encoding for Tabular Data

We consider the task of self-supervised representation learning (SSL) for tabular data: tabular-SSL. Typical contrastive-learning-based SSL methods require instance-wise data augmentations, which are difficult to design for unstructured tabular data. Existing tabular-SSL methods design such augmentations in a relatively ad-hoc fashion and can fail to capture the underlying data manifold. Instead of augmentation-based approaches for tabular-SSL, we propose a new reconstruction-based method, called Masked Encoding for Tabular Data (MET), that does not require augmentations. MET is based on the popular MAE approach for vision-SSL [He et al., 2021] and uses two key ideas: (i) since each coordinate in a tabular dataset has a distinct meaning, we use separate representations for all coordinates, and (ii) we use an adversarial reconstruction loss in addition to the standard one. Empirical results on five diverse tabular datasets show that MET achieves a new state of the art (SOTA) on all of these datasets and improves up to 9% over current SOTA methods. We shed more light on the working of MET via experiments on carefully designed simple datasets.

preprint, 2020, arXiv

DROCC: Deep Robust One-Class Classification

Classical approaches for one-class problems, such as one-class SVM and isolation forest, require careful feature engineering when applied to structured domains like images. State-of-the-art methods aim to leverage deep learning to learn appropriate features via two main approaches. The first approach, based on predicting transformations (Golan & El-Yaniv, 2018; Hendrycks et al., 2019a), while successful in some domains, crucially depends on an appropriate domain-specific set of transformations that are hard to obtain in general. The second approach, minimizing a classical one-class loss on the learned final-layer representations, e.g., DeepSVDD (Ruff et al., 2018), suffers from the fundamental drawback of representation collapse. In this work, we propose Deep Robust One-Class Classification (DROCC), which is both applicable to most standard domains without requiring any side-information and robust to representation collapse. DROCC is based on the assumption that the points from the class of interest lie on a well-sampled, locally linear low-dimensional manifold. Empirical evaluation demonstrates that DROCC is highly effective in two different one-class problem settings and on a range of real-world datasets across different domains: tabular data, images (CIFAR and ImageNet), audio, and time-series, offering up to 20% increase in accuracy over the state-of-the-art in anomaly detection. Code is available at https://github.com/microsoft/EdgeML.

preprint, 2020, arXiv

Flapping, swirling and flipping: Non-linear dynamics of pre-stressed active filaments

Initially straight slender elastic rods with geometrically constrained ends buckle and form stable two-dimensional shapes when compressed by bringing the ends together. It is also known that beyond a critical value of the pre-stress, clamped rods transition to bent, twisted three-dimensional equilibrium shapes. Recently, we showed that pre-stressed planar shapes, when immersed in a dissipative fluid and animated by nonconservative follower forces, exhibit stable large-amplitude flapping oscillations. Here, we use time-stepper methods to analyze the three-dimensional instabilities and dynamics of pre-stressed planar and non-planar filament configurations when subject to active follower forces and dissipative fluid drag. First, we find that the type of boundary constraint determines the nature of the non-linear patterns following instability. When the filament is clamped at one end and pinned at the other, with follower forces directed towards the clamped end, we observe only stable planar oscillations, termed flapping. When both ends are clamped, however, we observe a secondary instability wherein planar oscillations are destabilized by off-planar perturbations and result in fully three-dimensional swirling patterns characterized by two distinct time scales. The first time scale characterizes continuous and unidirectional swirling rotation around the end-to-end axis. The second time scale captures the rate at which the direction of swirling reverses or flips. The overall time over which the direction of swirling flips is very short compared to the long times over which the filament swirls in the same direction. Computations indicate that the reversal of swirling oscillations resembles relaxation oscillations, with each cycle initiated by a sudden jump in torsional deformation and then followed by a period of gradual decrease in net torsion until the next cycle of variations.

preprint, 2010, arXiv

Constitutive-law Modeling of Microfilaments from their Discrete-Structure Simulations - A Method based on an Inverse Approach Applied to a Static Rod Model

Twisting and bending deformations are crucial to the biological functions of microfilaments such as DNA molecules. Although continuum-rod models have emerged as efficient tools to describe the nonlinear dynamics of these deformations, a major roadblock in the continuum-mechanics-based description of microfilaments is the accurate modeling of the constitutive law, which follows from its atomistic structure and bond-stiffnesses. Since first-principle derivation of the constitutive law from atomistic structure is impractical and so are direct experimental measurements due to the small length-scales, a natural alternative is to estimate the constitutive law from discrete-structure simulations such as molecular-dynamics (MD) simulations. In this paper, we present a two-step inverse method for estimating the constitutive law using rod theory and data generated from discrete-structure simulations. We illustrate the method on a filament with an artificial and simplistic discrete-structure. We simulate its deformation in response to a prescribed loading using a multi-body dynamics (MBD) solver. Using data generated from the MBD solver, we first estimate the curvature of the filament and subsequently use it in the two-step method to estimate the effective constitutive-law relationship between the restoring moment and curvature. Finally, we also illustrate how the estimated constitutive law can be tested under independent loading conditions.
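The idea of recovering a moment-curvature relationship from discrete positions can be illustrated with a much simpler stand-in than the paper's two-step inverse method, which the abstract does not specify. The sketch below estimates curvature from three neighboring points (Menger curvature) and fits a linear law M = B κ by least squares through the origin. The function names `menger_curvature` and `fit_bending_stiffness`, the linear form of the law, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def menger_curvature(p0, p1, p2):
    """Curvature of the circle through three 2D points (assumed non-collinear)."""
    a = np.linalg.norm(p1 - p0)
    b = np.linalg.norm(p2 - p1)
    c = np.linalg.norm(p2 - p0)
    # 2D cross product gives twice the signed triangle area.
    d1, d2 = p1 - p0, p2 - p0
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    # Menger curvature: 4 * area / (a * b * c) = 2 * |cross| / (a * b * c).
    return 2.0 * abs(cross) / (a * b * c)

def fit_bending_stiffness(kappa, moment):
    """Least-squares slope B of the assumed linear law moment = B * kappa."""
    kappa = np.asarray(kappa, dtype=float)
    moment = np.asarray(moment, dtype=float)
    return float(kappa @ moment / (kappa @ kappa))

# Three points on a circle of radius 2: the estimated curvature should be 1/2.
angles = np.array([0.0, 0.1, 0.2])
pts = 2.0 * np.column_stack([np.cos(angles), np.sin(angles)])
kappa_est = menger_curvature(pts[0], pts[1], pts[2])

# Synthetic (hypothetical) moment data generated with stiffness B = 3.
kappas = np.array([0.1, 0.2, 0.4, 0.5])
B_est = fit_bending_stiffness(kappas, 3.0 * kappas)
```

A nonlinear constitutive law would replace the single slope with a richer parameterization of the moment as a function of curvature; the linear fit is used here only to keep the example short.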

preprint, 2010, arXiv

Estimation of Nonlinear Three-dimensional Constitutive Law for DNA Molecules

Long length-scale structural deformations of DNA play a central role in many biological processes including gene expression. The elastic rod model, which uses a continuum approximation, has emerged as a viable tool to model deformations of DNA molecules. The elastic rod model predictions are, however, very sensitive to the constitutive law (material properties) of the molecule, which, in turn, varies along the molecule's length according to its base-pair sequence. Identification of the nonlinear sequence-dependent constitutive law from experimental data and feasible molecular dynamics simulations remains a significant challenge. In this paper, we develop techniques to use elastic rod model equations in combination with limited experimental measurements or high-fidelity molecular dynamics simulation data to estimate the nonlinear constitutive law governing DNA molecules. We first cast the elastic rod model equations in state-space form and express the effect of the unknown constitutive law as an unknown input to the system. We then develop a two-step technique to estimate the unknown constitutive law. We discuss various generalizations and investigate the robustness of this technique through simulations.

preprint, 2007, arXiv

Nonlinear dynamic intertwining of rods with self-contact

Twisted marine cables on the sea floor can form highly contorted three-dimensional loops that resemble tangles. Such tangles or hockles are topologically equivalent to the plectonemes that form in supercoiled DNA molecules. The dynamic evolution of these intertwined loops is studied herein using a computational rod model that explicitly accounts for dynamic self-contact. Numerical solutions are presented for an illustrative example of a long rod subjected to increasing twist at one end. The solutions reveal the dynamic evolution of the rod from an initially straight state, through a buckled state in the approximate form of a helix, through the dynamic collapse of this helix into a near-planar loop with one site of self-contact, and the subsequent intertwining of this loop with multiple sites of self-contact. This evolution is controlled by the dynamic conversion of torsional strain energy to bending strain energy or, alternatively, by the dynamic conversion of twist (Tw) to writhe (Wr).

Key words: Rod Dynamics, Self-contact, Intertwining, DNA Supercoiling, Cable Hockling
