Researcher profile

Anjan Dutta

Anjan Dutta contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
10 works
0 followers
5 topics
4 close collaborators

Published work

10 published items

preprint, 2022, arXiv

Abstracting Sketches through Simple Primitives

Humans show a high level of abstraction capability in games that require quickly communicating object information. They decompose the message content into multiple parts and communicate them in an interpretable protocol. Toward equipping machines with such capabilities, we propose the Primitive-based Sketch Abstraction task, where the goal is to represent sketches using a fixed set of drawing primitives under the influence of a budget. To solve this task, our Primitive-Matching Network (PMN) learns interpretable abstractions of a sketch in a self-supervised manner. Specifically, PMN maps each stroke of a sketch to its most similar primitive in a given set, predicting an affine transformation that aligns the selected primitive to the target stroke. We learn this stroke-to-primitive mapping end-to-end with a distance-transform loss that is minimal when the original sketch is precisely reconstructed with the predicted primitives. Our PMN abstraction empirically achieves the highest performance on sketch recognition and sketch-based image retrieval given a communication budget, while at the same time being highly interpretable. This opens up new possibilities for sketch analysis, such as comparing sketches by extracting the most relevant primitives that define an object category. Code is available at https://github.com/ExplainableML/sketch-primitives.
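
The stroke-to-primitive matching described in this abstract can be illustrated with a small, non-learned sketch. It is not the PMN model: it assumes strokes and primitives are already sampled to the same number of ordered points, it fits a similarity transform (rotation, uniform scale, translation), which is a restricted special case of the affine transform mentioned above, and it scores the fit with a plain mean squared point distance rather than a distance-transform loss. The primitive set and all function names are hypothetical.

```python
import math

N = 16  # points per stroke and per primitive (illustrative choice)

def line(n=N):
    """Unit segment along the x-axis, as complex numbers x + iy."""
    return [complex(i / (n - 1), 0.0) for i in range(n)]

def arc(n=N):
    """Upper half of the unit circle."""
    return [complex(math.cos(math.pi * i / (n - 1)),
                    math.sin(math.pi * i / (n - 1))) for i in range(n)]

PRIMITIVES = {"line": line(), "arc": arc()}  # hypothetical primitive set

def fit_similarity(prim, stroke):
    """Least-squares a, t such that a * p + t approximates s point by point."""
    mp = sum(prim) / len(prim)
    ms = sum(stroke) / len(stroke)
    num = sum((s - ms) * (p - mp).conjugate() for p, s in zip(prim, stroke))
    den = sum(abs(p - mp) ** 2 for p in prim)
    a = num / den  # complex factor: modulus is the scale, argument the rotation
    return a, ms - a * mp

def match_stroke(stroke):
    """Return (name, error, a, t) for the primitive with the lowest residual."""
    best = None
    for name, prim in PRIMITIVES.items():
        a, t = fit_similarity(prim, stroke)
        err = sum(abs(a * p + t - s) ** 2
                  for p, s in zip(prim, stroke)) / len(stroke)
        if best is None or err < best[1]:
            best = (name, err, a, t)
    return best

# A line primitive rotated by 90 degrees, scaled by 2 and shifted to (1, 1).
stroke = [2j * p + (1 + 1j) for p in line()]
name, err, a, t = match_stroke(stroke)
print(name, abs(a), math.degrees(math.atan2(a.imag, a.real)))
```

Representing 2D points as complex numbers keeps the least-squares fit in closed form; a full affine fit would need a small linear solve per primitive instead.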

preprint, 2022, arXiv

BDA-SketRet: Bi-Level Domain Adaptation for Zero-Shot SBIR

The efficacy of zero-shot sketch-based image retrieval (ZS-SBIR) models is governed by two challenges. The immense distribution gap between the sketches and the images requires a proper domain alignment. Moreover, the fine-grained nature of the task and the high intra-class variance of many categories necessitate a class-wise discriminative mapping among the sketch, image, and semantic spaces. Under this premise, we propose BDA-SketRet, a novel ZS-SBIR framework performing a bi-level domain adaptation for aligning the spatial and semantic features of the visual data pairs progressively. In order to highlight the shared features and reduce the effects of any sketch or image-specific artifacts, we propose a novel symmetric loss function based on the notion of information bottleneck for aligning the semantic features, while a cross-entropy-based adversarial loss is introduced to align the spatial feature maps. Finally, our CNN-based model confirms the discriminativeness of the shared latent space through a novel topology-preserving semantic projection network. Experimental results on the extended Sketchy, TU-Berlin, and QuickDraw datasets exhibit sharp improvements over the literature.

preprint, 2022, arXiv

Discovery of cyclotron and narrow Fe K$_α$ lines in HMXB GRO J1750-27

We report on the timing and spectral analysis of the transient Be X-ray pulsar GRO J1750-27 using a Nuclear Spectroscopic Telescope Array (NuSTAR) observation from September 2021. This is the fourth outburst of the system since 1995. The NuSTAR observation was performed during the rising phase of the outburst. Pulsations at a period of 4.450710(1) s were observed in the 3-60 keV energy range. The average pulse profile comprised a broad peak with a weak secondary peak which evolved with energy. We did not find any appreciable variation in the X-ray emission during this observation. The broad-band phase-averaged spectrum is described by a blackbody, a powerlaw or Comptonization component. We report the discovery of an Fe K$_α$ line at 6.4 keV along with the presence of two cyclotron resonant scattering features around 36 and 42 keV. These lines indicate a magnetic field strength of $3.7_{-0.3}^{+0.1} \times 10^{12}$ and $(4.4 \pm 0.1) \times 10^{12}$ G for the neutron star. We have estimated a source distance of $\sim$ 13.6-16.4 kpc based on the accretion-disc torque models.
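
The step from cyclotron line energies to field strengths can be checked with the standard relation for the fundamental line, $E_{\rm cyc} \approx 11.57\,B_{12}/(1+z)$ keV, where $B_{12}$ is the field in units of $10^{12}$ G and $z$ is the gravitational redshift at the emission site. The abstract does not state the redshift used; the value $z = 0.2$ below is an assumption, chosen because it reproduces the quoted central values. The function name is hypothetical.

```python
KEV_PER_1E12_GAUSS = 11.57  # fundamental cyclotron energy at B = 1e12 G

def field_from_cyclotron_line(e_cyc_kev, redshift=0.2):
    """Magnetic field in gauss from an observed fundamental cyclotron line.

    redshift=0.2 is an assumed gravitational redshift, not a value
    taken from the abstract.
    """
    return e_cyc_kev * (1.0 + redshift) / KEV_PER_1E12_GAUSS * 1e12

for energy in (36.0, 42.0):
    b = field_from_cyclotron_line(energy)
    print(f"{energy:.0f} keV -> {b / 1e12:.1f}e12 G")
# 36 keV -> 3.7e12 G
# 42 keV -> 4.4e12 G
```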

preprint, 2020, arXiv

A broadband look of the Accreting Millisecond X-ray Pulsar SAX J1748.9-2021 using AstroSat and XMM-Newton

SAX J1748.9-2021 is a transient accretion-powered millisecond X-ray pulsar located in the globular cluster NGC 6440. We report on the spectral and timing analysis of SAX J1748.9-2021 performed on AstroSat data taken during its faint and short outburst of 2017. We derived the best-fitting orbital solution for the 2017 outburst and obtained an average local spin frequency of 442.361098(3) Hz. The pulse profiles obtained from the 3-7 keV and 7-20 keV energy bands suggest a constant fractional amplitude of ~0.5% for the fundamental component, contrary to the previously observed energy dependence of the pulse profile. Our AstroSat observations revealed the source to be in a hard spectral state. The 1-50 keV spectrum from SXT and LAXPC on-board AstroSat can be well described with a single-temperature blackbody and thermal Comptonization. Moreover, we found that the combined spectra from XMM-Newton (EPIC-PN) and AstroSat (SXT+LAXPC) indicated the presence of reflection features in the form of an iron (Fe K$α$) line that we modeled with the reflection model xillvercp. One of the two X-ray bursts observed during the AstroSat/LAXPC observation showed hard X-ray emission (>30 keV) due to Compton up-scattering of thermal photons by the hot corona. Time-resolved analysis performed on the bursts revealed a complex evolution in the blackbody emission radius for the second burst, suggestive of mild photospheric radius expansion.

preprint, 2020, arXiv

Bookworm continual learning: beyond zero-shot learning and continual learning

We propose bookworm continual learning (BCL), a flexible setting where unseen classes can be inferred via a semantic model, and the visual model can be updated continually. Thus BCL generalizes both continual learning (CL) and zero-shot learning (ZSL). We also propose the bidirectional imagination (BImag) framework to address BCL, where features of both past and future classes are generated. We observe that conditioning the feature generator on attributes can actually harm the continual learning ability, and propose two variants (joint class-attribute conditioning and asymmetric generation) to alleviate this problem.

preprint, 2020, arXiv

Broad-band spectral analysis of LMXB XTE J1710-281 with Suzaku

This work presents the broad-band time-averaged spectral analysis of the neutron star low-mass X-ray binary XTE J1710-281 using Suzaku archival data. The source was in a hard or an intermediate spectral state during this observation. This is the first time that a detailed spectral analysis of the persistent emission spectra of XTE J1710-281 has been done up to 30 keV with improved constraints on its spectral parameters. By simultaneously fitting the XIS (0.6-9.0 keV) and the HXD-PIN (15.0-30.0 keV) data, we have modelled the persistent spectrum of the source with models comprising a soft component from the accretion disc and/or the neutron star surface/boundary layer and a hard Comptonizing component. The 0.6-30 keV continuum with a neutral absorber can be described by a multi-colour disc blackbody with an inner disc temperature of $kT_{\rm disc} = 0.28$ keV, which is significantly Comptonized by the hot electron cloud with an electron temperature of $kT_{\rm e} \approx 5$ keV and described by a photon index $Γ = 1.86$. A more complex three-component model comprising a multi-colour disc blackbody ($\approx 0.30$ keV), a single-temperature blackbody ($\approx 0.65$ keV) and Comptonization from the disc, partially absorbed (about 38 per cent) by an ionized absorber (log($ξ$) $\approx$ 4), describes the broad-band spectrum equally well.

preprint, 2020, arXiv

Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval

In this paper, we investigate the problem of zero-shot sketch-based image retrieval (ZS-SBIR), where human sketches are used as queries to retrieve photos from unseen categories. We advance prior art by proposing a novel ZS-SBIR scenario that represents a firm step forward in its practical application. The new setting uniquely recognizes two important yet often neglected challenges of practical ZS-SBIR: (i) the large domain gap between amateur sketches and photos, and (ii) the necessity of moving towards large-scale retrieval. We first contribute to the community a novel ZS-SBIR dataset, QuickDraw-Extended, that consists of 330,000 sketches and 204,000 photos spanning 110 categories. Highly abstract amateur human sketches are purposefully sourced to maximize the domain gap, instead of ones included in existing datasets that can often be semi-photorealistic. We then formulate a ZS-SBIR framework to jointly model sketches and photos in a common embedding space. A novel strategy to mine the mutual information among domains is specifically engineered to alleviate the domain gap. External semantic knowledge is further embedded to aid semantic transfer. We show that, rather surprisingly, retrieval performance that significantly outperforms the state of the art on existing datasets can already be achieved using a reduced version of our model. We further demonstrate the superior performance of our full model by comparing it with a number of alternatives on the newly proposed dataset. The new dataset, plus all training and testing code of our model, will be publicly released to facilitate future research.

preprint, 2020, arXiv

Hierarchical stochastic graphlet embedding for graph-based pattern recognition

Despite being very successful within the pattern recognition and machine learning community, graph-based methods are often unusable because of the lack of mathematical operations defined in the graph domain. Graph embedding, which maps graphs to a vector space, has been proposed as a way to tackle these difficulties, enabling the use of standard machine learning techniques. However, it is well known that graph embedding functions usually suffer from the loss of structural information. In this paper, we consider the hierarchical structure of a graph as a way to mitigate this loss of information. The hierarchical structure is constructed by topologically clustering the graph nodes, and considering each cluster as a node in the upper hierarchical level. Once this hierarchical structure is constructed, we consider several configurations to define the mapping into a vector space given a classical graph embedding; in particular, we propose to make use of the Stochastic Graphlet Embedding (SGE). Broadly speaking, SGE produces a distribution of uniformly sampled low to high order graphlets as a way to embed graphs into the vector space. The coarse-to-fine structure of a graph hierarchy and the statistics fetched by the SGE complement each other and include important structural information with varied contexts. Altogether, these two techniques substantially reduce the usual information loss involved in graph embedding techniques, obtaining a more robust graph representation. This has been corroborated through a detailed experimental evaluation on various benchmark graph datasets, where we outperform the state-of-the-art methods.
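
The hierarchy construction described in this abstract, where each cluster of nodes becomes a single node at the next level, can be sketched as a graph coarsening step. This is a minimal illustration under stated assumptions: the cluster assignment is supplied by hand rather than computed by a topological clustering method, edges are undirected, and the function name `coarsen` is hypothetical.

```python
def coarsen(edges, cluster_of):
    """Collapse each cluster into one node of the upper level.

    edges: iterable of (u, v) pairs of an undirected graph.
    cluster_of: dict mapping every node to its cluster label.
    Two clusters are linked if any of their member nodes are linked;
    edges inside a cluster disappear.
    """
    upper = set()
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        if cu != cv:
            upper.add((min(cu, cv), max(cu, cv)))
    return sorted(upper)

# A path graph 0-1-2-3-4-5 with three hand-picked clusters.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
cluster_of = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C", 5: "C"}
level_1 = coarsen(edges, cluster_of)
print(level_1)  # [('A', 'B'), ('B', 'C')]
```

Applying `coarsen` repeatedly, with a new cluster assignment at each level, yields the coarse-to-fine stack of graphs that an embedding can then be computed on level by level.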

preprint, 2020, arXiv

Learning Robust Representations via Multi-View Information Bottleneck

The information bottleneck principle provides an information-theoretic method for representation learning, by training an encoder to retain all information which is relevant for predicting the label while minimizing the amount of other, excess information in the representation. The original formulation, however, requires labeled data to identify the superfluous information. In this work, we extend this ability to the multi-view unsupervised setting, where two views of the same underlying entity are provided but the label is unknown. This enables us to identify superfluous information as that not shared by both views. A theoretical analysis leads to the definition of a new multi-view model that produces state-of-the-art results on the Sketchy dataset and label-limited versions of the MIR-Flickr dataset. We also extend our theory to the single-view setting by taking advantage of standard data augmentation techniques, empirically showing better generalization capabilities when compared to common unsupervised approaches for representation learning.
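
The idea of treating information not shared by both views as superfluous can be illustrated with a simple disagreement penalty between the two views' encodings. The abstract does not specify the penalty, so the following is only one plausible choice: it assumes each encoder outputs a diagonal Gaussian over the representation and measures the mismatch with a symmetrized KL divergence (the average of both directions). All function names are hypothetical.

```python
import math

def kl_gauss(mu_p, sd_p, mu_q, sd_q):
    """KL(p || q) for diagonal Gaussians given as lists of means and std devs."""
    return sum(
        math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2 * sq ** 2) - 0.5
        for mp, sp, mq, sq in zip(mu_p, sd_p, mu_q, sd_q)
    )

def symmetric_kl(mu1, sd1, mu2, sd2):
    """Average of KL in both directions; zero only when the encodings agree."""
    return 0.5 * (kl_gauss(mu1, sd1, mu2, sd2) + kl_gauss(mu2, sd2, mu1, sd1))

# Encodings of two views of the same entity (made-up numbers).
view1 = ([0.0, 1.0], [1.0, 0.5])
view2 = ([0.1, 0.9], [1.0, 0.5])
penalty = symmetric_kl(*view1, *view2)
print(penalty)
```

In a training objective, a term like this would be weighed against a term that rewards information shared between the two representations; the weighting is a design choice not covered here.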

preprint, 2020, arXiv

Semantically Tied Paired Cycle Consistency for Any-Shot Sketch-based Image Retrieval

Low-shot sketch-based image retrieval is an emerging task in computer vision that allows the retrieval of natural images relevant to hand-drawn sketch queries that are rarely seen during the training phase. Related prior works require either aligned sketch-image pairs, which are costly to obtain, or an inefficient memory fusion layer for mapping the visual information to a semantic space. In this paper, we address any-shot, i.e. zero-shot and few-shot, sketch-based image retrieval (SBIR) tasks, where we introduce the few-shot setting for SBIR. For solving these tasks, we propose a semantically aligned paired cycle-consistent generative adversarial network (SEM-PCYC) for any-shot SBIR, where each branch of the generative adversarial network maps the visual information from sketch and image to a common semantic space via adversarial training. Each of these branches maintains cycle consistency that only requires supervision at the category level, and avoids the need for aligned sketch-image pairs. A classification criterion on the generators' outputs ensures that the visual-to-semantic space mapping is class-specific. Furthermore, we propose to combine textual and hierarchical side information via an auto-encoder that selects discriminating side information within the same end-to-end model. Our results demonstrate a significant boost in any-shot SBIR performance over the state-of-the-art on the extended version of the challenging Sketchy, TU-Berlin and QuickDraw datasets.