Source author record

Pushkar Mishra

Pushkar Mishra appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computation and Language Machine Learning Artificial Intelligence Data Structures and Algorithms Software Engineering

Catalog footprint

What is connected

4works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Prescriptive and Descriptive Approaches to Machine-Learning Transparency

Specialized documentation techniques have been developed to communicate key facts about machine-learning (ML) systems and the datasets and models they rely on. Techniques such as Datasheets, FactSheets, and Model Cards have taken a mainly descriptive approach, providing various details about the system components. While the above information is essential for product developers and external experts to assess whether the ML system meets their requirements, other stakeholders might find it less actionable. In particular, ML engineers need guidance on how to mitigate potential shortcomings in order to fix bugs or improve the system's performance. We survey approaches that aim to provide such guidance in a prescriptive way. We further propose a preliminary approach, called Method Cards, which aims to increase the transparency and reproducibility of ML systems by providing prescriptive documentation of commonly-used ML methods and techniques. We showcase our proposal with an example in small object detection, and demonstrate how Method Cards can communicate key considerations for model developers. We further highlight avenues for improving the user experience of ML engineers based on Method Cards.

preprint2022arXiv

Ruddit: Norms of Offensiveness for English Reddit Comments

On social media platforms, hateful and offensive language negatively impact the mental well-being of users and the participation of people from diverse backgrounds. Automatic methods to detect offensive language have largely relied on datasets with categorical labels. However, comments can vary in their degree of offensiveness. We create the first dataset of English language Reddit comments that has fine-grained, real-valued scores between -1 (maximally supportive) and 1 (maximally offensive). The dataset was annotated using Best--Worst Scaling, a form of comparative annotation that has been shown to alleviate known biases of using rating scales. We show that the method produces highly reliable offensiveness scores. Finally, we evaluate the ability of widely-used neural models to predict offensiveness scores on this new dataset.

preprint2020arXiv

Joint Modelling of Emotion and Abusive Language Detection

The rise of online communication platforms has been accompanied by some undesirable effects, such as the proliferation of aggressive and abusive behaviour online. Aiming to tackle this problem, the natural language processing (NLP) community has experimented with a range of techniques for abuse detection. While achieving substantial success, these methods have so far only focused on modelling the linguistic properties of the comments and the online communities of users, disregarding the emotional state of the users and how this might affect their language. The latter is, however, inextricably linked to abusive behaviour. In this paper, we present the first joint model of emotion and abusive language detection, experimenting in a multi-task learning framework that allows one task to inform the other. Our results demonstrate that incorporating affective features leads to significant improvements in abuse detection performance across datasets.

preprint2016arXiv

A New Algorithm for Updating and Querying Sub-arrays of Multidimensional Arrays

Given a $d$-dimensional array $A$, an update operation adds a given constant $C$ to each element within a continuous sub-array of $A$. A query operation computes the sum of all the elements within a continuous sub-array of $A$. The one-dimensional update and query handling problem has been studied intensively and is usually solved using segment trees with lazy propagation technique. In this paper, we present a new algorithm incorporating Binary Indexed Trees and Inclusion-Exclusion Principle to accomplish the same task. We extend the algorithm to update and query sub-matrices of matrices (two-dimensional array). Finally, we propose a general form of the algorithm for $d$-dimensions which achieves $\mathcal{O}(4^d*\log^{d}n)$ time complexity for both updates and queries. This is an improvement over the previously known algorithms which utilize hierarchical data structures like quadtrees and octrees and have a worst-case time complexity of $Ω(n^{d-1})$ per update/query.