Source author record

Rumen Dangovski

Rumen Dangovski appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, Computer Vision, math.AC, math.RA, physics.app-ph, Computation and Language, cond-mat.mtrl-sci, eess.IV, Neural and Evolutionary Computing, physics.optics

Catalog footprint

What is connected

6 works

10 topics

4 close collaborators

preprint · 2022 · arXiv

DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings

We propose DiffCSE, an unsupervised contrastive learning framework for learning sentence embeddings. DiffCSE learns sentence embeddings that are sensitive to the difference between the original sentence and an edited sentence, where the edited sentence is obtained by stochastically masking out the original sentence and then sampling from a masked language model. We show that DiffCSE is an instance of equivariant contrastive learning (Dangovski et al., 2021), which generalizes contrastive learning and learns representations that are insensitive to certain types of augmentations and sensitive to other "harmful" types of augmentations. Our experiments show that DiffCSE achieves state-of-the-art results among unsupervised sentence representation learning methods, outperforming unsupervised SimCSE by 2.3 absolute points on semantic textual similarity tasks.
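The editing step described in this abstract (mask tokens at random, then fill the masks by sampling) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_sampler` is a hypothetical stand-in for a masked language model, and the `[MASK]` string, function names and mask probability are assumptions made for the example.

```python
import random

MASK = "[MASK]"  # placeholder token; the exact string is an assumption


def mask_tokens(tokens, mask_prob, rng):
    """Independently replace each token with MASK with probability mask_prob."""
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]


def toy_sampler(position, masked_tokens, rng):
    """Hypothetical stand-in for a masked language model.

    A real model would condition on masked_tokens; this one just draws
    uniformly from a tiny fixed vocabulary.
    """
    vocabulary = ["cat", "dog", "sat", "ran", "mat", "the"]
    return rng.choice(vocabulary)


def edit_sentence(tokens, mask_prob, sampler, rng):
    """Return (masked, edited, changed) for one sentence.

    changed[i] is 1 where the edited token differs from the original,
    i.e. the "difference" between the original and the edited sentence.
    """
    masked = mask_tokens(tokens, mask_prob, rng)
    edited = [sampler(i, masked, rng) if tok == MASK else tok
              for i, tok in enumerate(masked)]
    changed = [int(a != b) for a, b in zip(tokens, edited)]
    return masked, edited, changed


if __name__ == "__main__":
    rng = random.Random(0)
    original = "the cat sat on the mat".split()
    masked, edited, changed = edit_sentence(original, 0.3, toy_sampler, rng)
    print(masked)
    print(edited)
    print(changed)
```

The per-token `changed` vector is one simple way to expose the difference between the two sentences; how that signal enters the training objective is not specified in the abstract and is left out here.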

preprint · 2022 · arXiv

Equivariant Contrastive Learning

In state-of-the-art self-supervised learning (SSL), pre-training produces semantically good representations by encouraging them to be invariant under meaningful transformations prescribed from human knowledge. In fact, the property of invariance is a trivial instance of a broader class called equivariance, which can be intuitively understood as the property that representations transform according to the way the inputs transform. Here, we show that rather than using only invariance, pre-training that encourages non-trivial equivariance to some transformations, while maintaining invariance to other transformations, can be used to improve the semantic quality of representations. Specifically, we extend popular SSL methods to a more general framework which we name Equivariant Self-Supervised Learning (E-SSL). In E-SSL, a simple additional pre-training objective encourages equivariance by predicting the transformations applied to the input. We demonstrate E-SSL's effectiveness empirically on several popular computer vision benchmarks, e.g. improving SimCLR to 72.5% linear probe accuracy on ImageNet. Furthermore, we demonstrate the usefulness of E-SSL for applications beyond computer vision; in particular, we show its utility on regression problems in photonics science. Our code, datasets and pre-trained models are available at https://github.com/rdangovs/essl to aid further research in E-SSL.
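The idea of "predicting the transformations applied to the input" can be illustrated with a toy data-preparation step. The sketch below assumes the transformation family is four-fold rotation, which the abstract does not state, and it only builds (transformed input, transformation label) pairs; the encoder, the prediction head and the loss are omitted. All function names are hypothetical.

```python
import random


def rotate90(image, k):
    """Rotate a 2D grid (list of rows) counter-clockwise by k quarter turns."""
    grid = [list(row) for row in image]
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise quarter turn.
        grid = [list(col) for col in zip(*grid)][::-1]
    return grid


def make_rotation_task(images, rng):
    """Pair each randomly rotated image with the index k of its rotation.

    A model trained to recover k from the rotated image must keep rotation
    information in its representation, which is the equivariance signal.
    """
    examples = []
    for image in images:
        k = rng.randrange(4)  # 0, 1, 2 or 3 quarter turns
        examples.append((rotate90(image, k), k))
    return examples


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[1, 2], [3, 4]]
    for rotated, label in make_rotation_task([image] * 4, rng):
        print(label, rotated)
```

In a full pipeline this prediction objective would be added to an existing invariance-based SSL loss rather than replace it, in line with the abstract's description of an "additional" objective.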

preprint · 2021 · arXiv

Surrogate- and invariance-boosted contrastive learning for data-scarce applications in science

Deep learning techniques have been increasingly applied to the natural sciences, e.g., for property prediction and optimization or material discovery. A fundamental ingredient of such approaches is the vast quantity of labelled data needed to train the model; this poses severe challenges in data-scarce settings where obtaining labels requires substantial computational or labor resources. Here, we introduce surrogate- and invariance-boosted contrastive learning (SIB-CL), a deep learning framework which incorporates three "inexpensive" and easily obtainable auxiliary information sources to overcome data scarcity. Specifically, these are: 1) abundant unlabeled data, 2) prior knowledge of symmetries or invariances and 3) surrogate data obtained at near-zero cost. We demonstrate SIB-CL's effectiveness and generality on various scientific problems, e.g., predicting the density-of-states of 2D photonic crystals and solving the 3D time-independent Schrödinger equation. SIB-CL consistently results in orders of magnitude reduction in the number of labels needed to achieve the same network accuracies.

preprint · 2020 · arXiv

Contextualizing Enhances Gradient Based Meta Learning

Meta learning methods have found success when applied to few shot classification problems, in which they quickly adapt to a small number of labeled examples. Prototypical representations, each representing a particular class, have been of particular importance in this setting, as they provide a compact form to convey information learned from the labeled examples. However, these prototypes are just one method of representing this information, and they are narrow in their scope and ability to classify unseen examples. We propose the implementation of contextualizers, which are generalizable prototypes that adapt to given examples and play a larger role in classification for gradient-based models. We demonstrate how to equip meta learning methods with contextualizers and show that their use can significantly boost performance on a range of few shot learning datasets. We also present figures of merit demonstrating the potential benefits of contextualizers, along with analysis of how models make use of them. Our approach is particularly apt for low-data environments where it is difficult to update parameters without overfitting. Our implementation and instructions to reproduce the experiments are available at https://github.com/naveace/proto-context.

preprint · 2015 · arXiv

Weitzenboeck derivations of free metabelian associative algebras

By the classical theorem of Weitzenboeck the algebra of constants (i.e., the kernel) of a nonzero locally nilpotent linear derivation of the polynomial algebra K[X] in d variables over a field K of characteristic 0 is finitely generated. As a noncommutative generalization one considers the algebra of constants of a locally nilpotent linear derivation of a d-generated relatively free algebra F(V) in a variety V of unitary associative algebras over K. It is known that the algebra of constants of F(V) is finitely generated if and only if V satisfies a polynomial identity which does not hold for the algebra of 2 x 2 upper triangular matrices. Hence the free metabelian associative algebra F(M) is a crucial object to study. We show that the vector space of the constants in the commutator ideal F'(M) is a finitely generated module of the algebra of constants of the polynomial algebra K[U,V] in 2d variables, where the derivation acts on U and V in the same way as on X. For small d, we calculate the Hilbert series of the constants in F'(M) and find the generators of the related module. This gives also an (infinite) set of generators of the algebra of constants in F(M).

preprint · 2013 · arXiv

Weitzenboeck derivations of free metabelian Lie algebras

A nonzero locally nilpotent linear derivation of the polynomial algebra K[X] in d variables over a field K of characteristic 0 is called a Weitzenboeck derivation. The classical theorem of Weitzenboeck states that the algebra of constants (which coincides with the algebra of invariants of a single unipotent transformation) is finitely generated. Similarly one may consider the algebra of constants of a locally nilpotent linear derivation of a finitely generated (not necessarily commutative or associative) algebra which is relatively free in a variety of algebras over K. Now the algebra of constants is usually not finitely generated. Except for some trivial cases this holds for the algebra of constants of the free metabelian Lie algebra L/L" with d generators. We show that the vector space of the constants in the commutator ideal L'/L" is a finitely generated module over the algebra of constants in K[X]. For small d, we calculate the Hilbert series of the algebra of constants in L/L" and find the generators of the module of the constants in L'/L". This gives also an (infinite) set of generators of the Lie algebra of constants in L/L".
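As a small illustration of the classical (commutative) case mentioned in this abstract, consider d = 3 with the derivation acting as a single nilpotent Jordan block. This is a standard textbook example, not a result of the paper; the symbols \delta, x_1, x_2, x_3 and H(t) are notation chosen here.

```latex
% A Weitzenboeck derivation of K[x_1, x_2, x_3] (single Jordan block):
\[
\delta(x_1) = 0, \qquad \delta(x_2) = x_1, \qquad \delta(x_3) = x_2 .
\]
% By the Leibniz rule, the quadratic x_2^2 - 2 x_1 x_3 is a constant:
\[
\delta\bigl(x_2^2 - 2x_1x_3\bigr)
  = 2x_2\,\delta(x_2) - 2\,\delta(x_1)\,x_3 - 2x_1\,\delta(x_3)
  = 2x_1x_2 - 0 - 2x_1x_2 = 0 .
\]
% The algebra of constants is generated by these two elements,
% and its Hilbert series follows from their degrees (1 and 2):
\[
K[x_1,x_2,x_3]^{\delta} = K\bigl[x_1,\; x_2^2 - 2x_1x_3\bigr],
\qquad
H(t) = \frac{1}{(1-t)(1-t^2)} .
\]
```

Here the algebra of constants is finitely generated, as the classical theorem guarantees; the abstract's point is that this generally fails once K[X] is replaced by a relatively free noncommutative or nonassociative algebra.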