Researcher profile

Mohd Zeeshan Ansari

Mohd Zeeshan Ansari contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 13 (Unverified) · Verification L1 · Unclaimed author
2 works
0 followers
4 topics
4 close collaborators

Published work

2 published items

Preprint · 2020 · arXiv

Feature Selection on Noisy Twitter Short Text Messages for Language Identification

Written language identification typically involves detecting the languages present in a sample of text. A sequence of text may not belong to a single language; it may instead be a mixture of text written in multiple languages. Text of this kind is generated in large volumes on social media platforms because of their flexible and user-friendly environment. Such text contains a very large number of features, which are essential for developing statistical, probabilistic and other kinds of language models. These features include informative ones as well as irrelevant and redundant ones, which have diverse effects on the performance of the learning model. Feature selection methods are therefore important for choosing the features that are most relevant to an efficient model. In this article, we consider the Hindi-English language identification task, as Hindi and English are two of the most widely spoken languages of India. We apply different feature selection algorithms across various learning algorithms in order to analyze the effect of the algorithm, as well as the number of features, on the performance of the task. The methodology focuses on word-level language identification using a novel dataset of 6903 tweets extracted from Twitter. Various n-gram profiles are examined with different feature selection algorithms over many classifiers. Finally, an exhaustive comparative analysis is presented covering all the experiments conducted for the task.
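As a rough illustration of the kind of pipeline the abstract describes (character n-gram profiles followed by feature selection), the sketch below builds bigram features for individual words and ranks them with a chi-square score. Chi-square is only one common selection criterion; the abstract does not name the algorithms actually used, so treat that choice as an assumption. The toy word list, the labels and all function names are hypothetical and are not taken from the paper's dataset.

```python
from collections import Counter

# Hypothetical toy data: (word, language label). Not from the paper's dataset.
TOY_SAMPLES = [
    ("kya", "hi"), ("nahi", "hi"), ("hai", "hi"),
    ("the", "en"), ("this", "en"), ("that", "en"),
]

def char_ngrams(word, n=2):
    """Return the character n-grams of a word, padded with boundary markers."""
    padded = f"^{word.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def chi_square_scores(samples, n=2):
    """Score each n-gram with the chi-square statistic of its 2x2
    (n-gram present / absent) x (class one / class two) contingency table."""
    labels = sorted({label for _, label in samples})
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two classes")
    first, second = labels
    total = len(samples)
    class_sizes = Counter(label for _, label in samples)

    # Count, per class, how many words contain each n-gram at least once.
    present = {label: Counter() for label in labels}
    for word, label in samples:
        for gram in set(char_ngrams(word, n)):
            present[label][gram] += 1

    scores = {}
    for gram in set(present[first]) | set(present[second]):
        a = present[first][gram]        # class one, n-gram present
        b = present[second][gram]       # class two, n-gram present
        c = class_sizes[first] - a      # class one, n-gram absent
        d = class_sizes[second] - b     # class two, n-gram absent
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[gram] = total * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def select_top_k(samples, k, n=2):
    """Keep the k highest-scoring n-grams (ties broken alphabetically)."""
    scores = chi_square_scores(samples, n)
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]

if __name__ == "__main__":
    print(select_top_k(TOY_SAMPLES, k=2))
```

On this toy list the two selected bigrams are the ones that occur in every English word and in no Hindi word. Varying `k` and `n` mirrors, in miniature, the abstract's question of how the number of features and the n-gram profile affect performance.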

Preprint · 2020 · arXiv

Inferring Political Preferences from Twitter

Sentiment analysis is the automatic analysis of users' opinions and emotions towards an entity or some aspect of that entity. Political sentiment analysis of social media helps political strategists scrutinize the performance of a party or candidate and address weaknesses well before the actual elections. During elections, social networks are flooded with blogs, chats, debates and discussions about the prospects of political parties and politicians. The amount of data generated is very large, which makes it challenging to study, analyze and draw inferences from using the latest techniques. Twitter, one of the most popular social media platforms, enables us to perform domain-specific data preparation. In this work, we identify the inclination of political opinions present in tweets by modelling the task as a text classification problem using classical machine learning. Tweets related to the 2020 Delhi elections are extracted and used for the task. Among the several algorithms evaluated, we observe that Support Vector Machines deliver the best performance.
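To make the "text classification with classical machine learning" framing concrete, here is a minimal sketch of a linear Support Vector Machine trained by sub-gradient descent on the regularized hinge loss over bag-of-words features. It is not the authors' implementation: the abstract does not specify features, labels or hyperparameters, so the example tweets, the +1/-1 label meaning, the omission of a bias term and every function name and parameter value are assumptions made for illustration.

```python
from collections import Counter

def bag_of_words(text):
    """Lowercase, whitespace-tokenize, and count terms."""
    return Counter(text.lower().split())

def train_linear_svm(docs, labels, epochs=20, lr=0.1, lam=0.01):
    """Fit a bias-free linear SVM with fixed-step sub-gradient descent
    on the L2-regularized hinge loss. Labels must be +1 or -1."""
    weights = {}
    for _ in range(epochs):
        for doc, y in zip(docs, labels):
            feats = bag_of_words(doc)
            margin = y * sum(weights.get(t, 0.0) * c for t, c in feats.items())
            # Regularization step: shrink every weight slightly.
            for term in weights:
                weights[term] *= 1.0 - lr * lam
            # Hinge-loss step: only examples inside the margin contribute.
            if margin < 1.0:
                for term, count in feats.items():
                    weights[term] = weights.get(term, 0.0) + lr * y * count
    return weights

def predict(weights, text):
    """Return +1 or -1 according to the sign of the linear score."""
    score = sum(weights.get(t, 0.0) * c for t, c in bag_of_words(text).items())
    return 1 if score >= 0 else -1

# Hypothetical toy tweets; +1 = favourable, -1 = critical (assumed labels).
TOY_TWEETS = [
    "great rally strong support",
    "good work great leader",
    "poor roads broken promises",
    "bad policy poor planning",
]
TOY_LABELS = [1, 1, -1, -1]

if __name__ == "__main__":
    model = train_linear_svm(TOY_TWEETS, TOY_LABELS)
    print(predict(model, "great support"), predict(model, "broken policy"))
```

In practice a library implementation with TF-IDF features and cross-validated hyperparameters would be the more usual choice; the hand-rolled version here only shows the hinge-loss update that underlies a linear SVM.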