Source author record

Esteban Feuerstein

Esteban Feuerstein appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Information Retrieval Machine Learning Computation and Language Data Structures and Algorithms Social and Information Networks

Catalog footprint

What is connected

3works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2021arXiv

Improved Topic modeling in Twitter through Community Pooling

Social networks play a fundamental role in propagation of information and news. Characterizing the content of the messages becomes vital for different tasks, like breaking news detection, personalized message recommendation, fake users detection, information flow characterization and others. However, Twitter posts are short and often less coherent than other text documents, which makes it challenging to apply text mining algorithms to these datasets efficiently. Tweet-pooling (aggregating tweets into longer documents) has been shown to improve automatic topic decomposition, but the performance achieved in this task varies depending on the pooling method. In this paper, we propose a new pooling scheme for topic modeling in Twitter, which groups tweets whose authors belong to the same community (group of users who mainly interact with each other but not with other groups) on a user interaction graph. We present a complete evaluation of this methodology, state of the art schemes and previous pooling models in terms of the cluster quality, document retrieval tasks performance and supervised machine learning classification score. Results show that our Community polling method outperformed other methods on the majority of metrics in two heterogeneous datasets, while also reducing the running time. This is useful when dealing with big amounts of noisy and short user-generated social media texts. Overall, our findings contribute to an improved methodology for identifying the latent topics in a Twitter dataset, without the need of modifying the basic machinery of a topic decomposition model.

preprint2020arXiv

Vocabulary-based Method for Quantifying Controversy in Social Media

Identifying controversial topics is not only interesting from a social point of view, it also enables the application of methods to avoid the information segregation, creating better discussion contexts and reaching agreements in the best cases. In this paper we develop a systematic method for controversy detection based primarily on the jargon used by the communities in social media. Our method dispenses with the use of domain-specific knowledge, is language-agnostic, efficient and easy to apply. We perform an extensive set of experiments across many languages, regions and contexts, taking controversial and non-controversial topics. We find that our vocabulary-based measure performs better than state of the art measures that are based only on the community graph structure. Moreover, we shows that it is possible to detect polarization through text analysis.

preprint2014arXiv

Scheduling over Scenarios on Two Machines

We consider scheduling problems over scenarios where the goal is to find a single assignment of the jobs to the machines which performs well over all possible scenarios. Each scenario is a subset of jobs that must be executed in that scenario and all scenarios are given explicitly. The two objectives that we consider are minimizing the maximum makespan over all scenarios and minimizing the sum of the makespans of all scenarios. For both versions, we give several approximation algorithms and lower bounds on their approximability. With this research into optimization problems over scenarios, we have opened a new and rich field of interesting problems.