Researcher profile

Sumeet Kumar

Sumeet Kumar contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
6 topics
4 close collaborators

Published work

4 published items

Preprint, 2022, arXiv

'Beach' to 'Bitch': Inadvertent Unsafe Transcription of Kids' Content on YouTube

Over the last few years, YouTube Kids has emerged as one of the highly competitive alternatives to television for children's entertainment. Consequently, YouTube Kids' content should receive an additional level of scrutiny to ensure children's safety. While research on detecting offensive or inappropriate content for kids is gaining momentum, little or no current work investigates the extent to which AI applications can (accidentally) introduce content that is inappropriate for kids. In this paper, we present a novel (and troubling) finding that well-known automatic speech recognition (ASR) systems may produce text content highly inappropriate for kids while transcribing YouTube Kids' videos. We dub this phenomenon inappropriate content hallucination. Our analyses suggest that such hallucinations are far from occasional, and the ASR systems often produce them with high confidence. We release a first-of-its-kind dataset of audio for which the existing state-of-the-art ASR systems hallucinate inappropriate content for kids. In addition, we demonstrate that some of these errors can be fixed using language models.

Preprint, 2020, arXiv

Pitch-rotational manipulation of single cells and particles using single-beam thermo-optical tweezers

3D pitch rotation of microparticles and cells is important in a wide variety of applications in biology, physics, chemistry and medicine. Applications such as cell imaging and injection benefit from pitch-rotational manipulation. Generating such motion in single-beam optical tweezers has remained elusive because of the difficulty of generating high enough ellipticity perpendicular to the direction of propagation. Further, trapping an extended object at two locations can only generate partial pitch motion by moving one of the foci in the axial direction. Here, we use hexagonal-shaped upconverting particles and single cells trapped close to a gold-coated glass cover slip in a sample chamber to generate complete 360-degree, continuous pitch motion even with a single optical tweezers beam. The tweezers beam passing through the gold surface is partially absorbed and generates a hot spot that produces circulatory convective flows in the vicinity, which rotate the objects. The rotation rate can be controlled by the intensity of the laser light and the thickness of the gold layer. Thus, such a simple configuration can turn the particle in the pitch sense. The circulatory flows in this technique have a diameter of about 5 μm, which is smaller than those reported using acousto-fluidic techniques.

Preprint, 2020, arXiv

Stance in Replies and Quotes (SRQ): A New Dataset For Learning Stance in Twitter Conversations

Automated ways to extract stance (denying vs. supporting opinions) from conversations on social media are essential to advance opinion mining research. Recently, there has been renewed excitement in the field as new models attempt to improve the state of the art. However, the datasets used for training and evaluating the models are often small. Additionally, these small datasets have uneven class distributions, i.e., only a tiny fraction of the examples in the dataset have favoring or denying stances, and most other examples have no clear stance. Moreover, the existing datasets do not distinguish between the different types of conversations on social media (e.g., replying vs. quoting on Twitter). Because of this, models trained on one event do not generalize to other events. In this work, we create a new dataset by labeling stance in responses to posts on Twitter (both replies and quotes) on controversial issues. To the best of our knowledge, this is currently the largest human-labeled stance dataset for Twitter conversations, with over 5200 stance labels. More importantly, we designed a tweet collection methodology that favors the selection of denial-type responses. This class is expected to be more useful in identifying rumors and determining antagonistic relationships between users. Moreover, we include many baseline models for learning the stance in conversations and compare their performance. We show that combining data from replies and quotes decreases the accuracy of models, indicating that the two modalities behave differently when it comes to stance learning.

Preprint, 2019, arXiv

Anomalous diffusion in an electrolyte saturated paper matrix

Diffusion of colored dye on water-saturated paper substrates has been traditionally exploited with great skill by renowned watercolor artists. The same physics finds more recent practical applications in paper-based diagnostic devices deploying chemicals that react with a bodily fluid, yielding colorimetric signals for disease detection. During spontaneous imbibition through the tortuous pathways of a porous electrolyte-saturated paper matrix, a dye molecule undergoes diffusion in a complex network of pores. The advancing front forms a strongly correlated interface that propagates diffusively but with an enhanced effective diffusivity. We measure this effective diffusivity and show that it is several orders of magnitude greater than the free solution diffusivity and has a significant dependence on the solution pH and salt concentration in the background electrolyte. We attribute this to electrically mediated interfacial interactions between the ionic species in the liquid dye and spontaneous surface charges developed at porous interfaces, and introduce a simple theory to explain this phenomenon.