Source author record

Gautam Singh

Gautam Singh appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, Computer Vision, Artificial Intelligence, Computation and Language, Biomolecules, cond-mat.soft, eess.IV

Catalog footprint

What is connected

8 works

7 topics

4 close collaborators

preprint · 2022 · arXiv

Illiterate DALL-E Learns to Compose

Although DALL-E has shown an impressive ability for composition-based systematic generalization in image generation, it requires a dataset of text-image pairs, and the compositionality is provided by the text. In contrast, object-centric representation models like the Slot Attention model learn composable representations without a text prompt. However, unlike DALL-E, their ability to systematically generalize for zero-shot generation is significantly limited. In this paper, we propose a simple but novel slot-based autoencoding architecture, called SLATE, for combining the best of both worlds: learning object-centric representations that allow systematic generalization in zero-shot image generation without text. As such, this model can also be seen as an illiterate DALL-E model. Unlike the pixel-mixture decoders of existing object-centric representation models, we propose to use the Image GPT decoder conditioned on the slots for capturing complex interactions among the slots and pixels. In experiments, we show that this simple and easy-to-implement architecture, which does not require a text prompt, achieves significant improvement in in-distribution and out-of-distribution (zero-shot) image generation and qualitatively comparable or better slot-attention structure than the models based on mixture decoders.

preprint · 2022 · arXiv

Simple Unsupervised Object-Centric Learning for Complex and Naturalistic Videos

Unsupervised object-centric learning aims to represent the modular, compositional, and causal structure of a scene as a set of object representations and thereby promises to resolve many critical limitations of traditional single-vector representations, such as poor systematic generalization. Although there have been many remarkable advances in recent years, one of the most critical problems in this direction has been that previous methods work only with simple and synthetic scenes but not with complex and naturalistic images or videos. In this paper, we propose STEVE, an unsupervised model for object-centric learning in videos. Our proposed model makes a significant advancement by demonstrating its effectiveness on various complex and naturalistic videos unprecedented in this line of research. Interestingly, this is achieved by neither adding complexity to the model architecture nor introducing a new objective or weak supervision. Rather, it is achieved by a surprisingly simple architecture that uses a transformer-based image decoder conditioned on slots, with a learning objective that is simply to reconstruct the observation. Our experimental results on various complex and naturalistic videos show significant improvements compared to the previous state of the art.

preprint · 2020 · arXiv

Fair Transfer of Multiple Style Attributes in Text

To preserve anonymity and obfuscate their identity on online platforms, users may morph their text and portray themselves as a different gender or demographic. Similarly, a chatbot may need to customize its communication style to improve engagement with its audience. This manner of changing the style of written text has gained significant attention in recent years. Yet past research largely caters to the transfer of single style attributes. The disadvantage of focusing on a single style alone is that this often results in target text where other existing style attributes behave unpredictably or are unfairly dominated by the new style. To counteract this behavior, it would be useful to have a style transfer mechanism that can transfer or control multiple styles simultaneously and fairly. Through such an approach, one could obtain obfuscated or written text incorporating a desired degree of multiple soft styles such as female-quality, politeness, or formalness. In this work, we demonstrate that the transfer of multiple styles cannot be achieved by sequentially performing multiple single-style transfers. This is because each single style-transfer step often reverses or dominates over the style incorporated by a previous transfer step. We then propose a neural network architecture for fairly transferring multiple style attributes in a given text. We test our architecture on the Yelp data set to demonstrate its superior performance compared to existing one-style transfer steps performed in sequence.

preprint · 2020 · arXiv

Robustifying Sequential Neural Processes

When tasks change over time, meta-transfer learning seeks to improve the efficiency of learning a new task via both meta-learning and transfer learning. While standard attention has been effective in a variety of settings, we question its effectiveness in improving meta-transfer learning, since the tasks being learned are dynamic and the amount of context can be substantially smaller. In this paper, using a recently proposed meta-transfer learning model, Sequential Neural Processes (SNP), we first empirically show that it suffers from an underfitting problem similar to the one observed in the functions inferred by Neural Processes. However, we further demonstrate that, unlike in the meta-learning setting, the standard attention mechanisms are not effective in the meta-transfer setting. To resolve this, we propose a new attention mechanism, Recurrent Memory Reconstruction (RMR), and demonstrate that providing an imaginary context that is recurrently updated and reconstructed with interaction is crucial in achieving effective attention for meta-transfer learning. Furthermore, incorporating RMR into SNP, we propose Attentive Sequential Neural Processes-RMR (ASNP-RMR) and demonstrate in various tasks that ASNP-RMR significantly outperforms the baselines.

preprint · 2020 · arXiv

SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition

The ability to decompose complex multi-object scenes into meaningful abstractions like objects is fundamental to achieving higher-level cognition. Previous approaches for unsupervised object-oriented scene representation learning are based on either spatial-attention or scene-mixture approaches and are limited in scalability, which is a main obstacle to modeling real-world scenes. In this paper, we propose a generative latent variable model, called SPACE, that provides a unified probabilistic modeling framework combining the best of spatial-attention and scene-mixture approaches. SPACE can explicitly provide factorized object representations for foreground objects while also decomposing background segments of complex morphology. Previous models are good at either of these, but not both. SPACE also resolves the scalability problems of previous methods by incorporating parallel spatial attention and thus is applicable to scenes with a large number of objects without performance degradation. We show through experiments on Atari and 3D-Rooms that SPACE achieves the above properties consistently in comparison to SPAIR, IODINE, and GENESIS. Results of our experiments can be found on our project website: https://sites.google.com/view/space-project-page

preprint · 2016 · arXiv

Cross-Lingual Predicate Mapping Between Linked Data Ontologies

Ontologies in different natural languages often differ in quality in terms of richness of schema or richness of internal links. This difference is markedly visible when comparing a rich English-language ontology with a non-English-language counterpart. Discovering alignment between them is a useful endeavor, as it serves as a starting point in bridging the disparity. In particular, our work is motivated by the absence of inter-language links for predicates in the localised versions of DBpedia. In this paper, we propose and demonstrate an ad-hoc system to find possible owl:equivalentProperty links between predicates in ontologies of different natural languages. We seek to achieve this mapping by using pre-existing inter-language links of the resources connected by the given predicate. Thus, our methodology stresses semantic rather than lexical similarity. Moreover, through an evaluation, we show that our system is capable of outperforming a baseline system that is similar to the one used in recent OAEI campaigns.
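The idea of aligning predicates through the inter-language links of the resources they connect can be illustrated with a minimal sketch. This is not the authors' system: the function name `map_predicates`, the Jaccard overlap score, the `min_score` threshold, and the toy `en:`/`fr:` identifiers are all assumptions made for illustration. The sketch translates each (subject, object) pair of one ontology through the links and scores predicate pairs by how many resource pairs they share.

```python
from collections import defaultdict


def map_predicates(triples_a, triples_b, interlanguage_links, min_score=0.5):
    """Propose equivalent predicate pairs between two ontologies.

    triples_a, triples_b: iterables of (subject, predicate, object).
    interlanguage_links: dict mapping resources of ontology A to ontology B.
    The Jaccard score and the threshold are illustrative assumptions.
    """
    # Translate A's (subject, object) pairs into B's resource identifiers.
    pairs_a = defaultdict(set)
    for s, p, o in triples_a:
        # Only pairs linked on both ends can be compared across languages.
        if s in interlanguage_links and o in interlanguage_links:
            pairs_a[p].add((interlanguage_links[s], interlanguage_links[o]))

    pairs_b = defaultdict(set)
    for s, p, o in triples_b:
        pairs_b[p].add((s, o))

    candidates = []
    for pred_a, set_a in pairs_a.items():
        for pred_b, set_b in pairs_b.items():
            # set_a is never empty here, so the union is never empty.
            score = len(set_a & set_b) / len(set_a | set_b)
            if score >= min_score:
                candidates.append((pred_a, pred_b, score))
    # Highest-scoring candidate links first.
    return sorted(candidates, key=lambda c: -c[2])


# Hypothetical toy data; the identifiers are not real DBpedia URIs.
triples_en = [
    ("en:France", "en:capital", "en:Paris"),
    ("en:Germany", "en:capital", "en:Berlin"),
    ("en:France", "en:currency", "en:Euro"),
]
triples_fr = [
    ("fr:France", "fr:capitale", "fr:Paris"),
    ("fr:Allemagne", "fr:capitale", "fr:Berlin"),
    ("fr:France", "fr:monnaie", "fr:Euro"),
]
links = {
    "en:France": "fr:France",
    "en:Germany": "fr:Allemagne",
    "en:Paris": "fr:Paris",
    "en:Berlin": "fr:Berlin",
    "en:Euro": "fr:Euro",
}

for pred_en, pred_fr, score in map_predicates(triples_en, triples_fr, links):
    print(pred_en, "~", pred_fr, round(score, 2))
```

Because the score depends only on which resources a predicate connects, predicates with unrelated labels can still be matched, which is the sense in which such an approach is semantic rather than lexical.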

preprint · 2015 · arXiv

Orientational Order in the Nematic and Heliconical Nematic Liquid Crystals

X-ray scattering and polarized microscopic studies of the structure and order parameters in the nematic (N) and heliconical, or twist-bend nematic (Ntb), phases have been performed as a function of temperature. The nematic orientational order parameters <P2(cosθ)> and <P4(cosθ)> in the nematic phases of CB7CB and its mixtures with less than 20 wt% CB6CB both increase with decreasing temperature in the N phase. Both order parameters decrease upon entering the Ntb phase, and <P4(cosθ)> becomes negative, providing a direct confirmation of the conical molecular orientational distribution. The heliconical tilt angle, estimated from the orientational distribution functions (ODFs), in all cases increases from zero at the N-Ntb transition to approximately 27° at about 40 K below the transition, in excellent agreement with the freeze-fracture transmission electron microscopy results of Chen and the birefringence results of Meyer. The growth of the tilt angle in the Ntb phase follows a single power law with exponents ranging from ~0.09 ± 0.01 to -0.12 ± 0.01, which is far from the expected tricritical or mean-field exponents of 0.25 or 0.5. The temperature dependence of the tilt angle calculated from the ODFs is also in good qualitative agreement with the values estimated from optical studies of their ropelike textures within adjacent blocks of left- and right-handed twist in homogeneously aligned cells.
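For readers unfamiliar with the notation, the order parameters named in this abstract are averages of Legendre polynomials over the orientational distribution. The standard definitions are sketched below; the symbol f(θ) for the ODF is an assumption of this note, not notation taken from the paper.

```latex
\[
\langle P_n(\cos\theta) \rangle
  = \frac{\int_0^{\pi} P_n(\cos\theta)\, f(\theta)\, \sin\theta \, d\theta}
         {\int_0^{\pi} f(\theta)\, \sin\theta \, d\theta},
\qquad n = 2, 4,
\]
\[
P_2(x) = \tfrac{1}{2}\,(3x^2 - 1), \qquad
P_4(x) = \tfrac{1}{8}\,(35x^4 - 30x^2 + 3).
\]
```

Since P4(cosθ) is negative only for polar angles between roughly 30.6° and 70.1°, a negative average requires the distribution to carry substantial weight at angles tilted away from the axis.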

preprint · 2014 · arXiv

Arginine-Phosphate Salt Bridges Between Histones and DNA: Intermolecular Actuators that Control Nucleosome Architecture

Structural bioinformatics and van der Waals density functional theory are combined to investigate the mechanochemical impact of a major class of histone-DNA interactions, namely the formation of salt bridges between arginine residues in histones and phosphate groups on the DNA backbone. Principal component analysis reveals that the configurational fluctuations of the sugar-phosphate backbone display sequence-specific variability, and clustering of nucleosomal crystal structures identifies two major salt bridge configurations: a monodentate form, in which the arginine end-group guanidinium forms only one hydrogen bond with the phosphate, and a bidentate form, in which it forms two. Density functional theory calculations highlight that the combination of sequence, denticity and salt bridge positioning enables the histones to tunably activate specific backbone deformations via mechanochemical stress. The results suggest that selection for specific placements of van der Waals contacts, with high-precision control of the spatial distribution of intermolecular forces, may serve as an underlying evolutionary design principle for the structure and function of nucleosomes, a conjecture that is corroborated by previous experimental studies.
