Source author record

Maryum Sayeed

Maryum Sayeed appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Catalog footprint

What is connected

1works
3topics
4close collaborators

Actions

Connect this record

Log in to claim

Research graph

See the researcher in context

Open full explorer

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

1 published item(s)

preprint2022arXiv

Predicting Winners of the Reality TV Dating Show $\textit{The Bachelor}$ Using Machine Learning Algorithms

$\textit{The Bachelor}$ is a reality TV dating show in which a single bachelor selects his wife from a pool of approximately 30 female contestants over eight weeks of filming (American Broadcasting Company 2002). We collected the following data on all 422 contestants that participated in seasons 11 through 25: their Age, Hometown, Career, Race, Week they got their first 1-on-1 date, whether they got the first impression rose, and what "place" they ended up getting. We then trained three machine learning models to predict the ideal characteristics of a successful contestant on $\textit{The Bachelor}$. The three algorithms that we tested were: random forest classification, neural networks, and linear regression. We found consistency across all three models, although the neural network performed the best overall. Our models found that a woman has the highest probability of progressing far on $\textit{The Bachelor}$ if she is: 26 years old, white, from the Northwest, works as an dancer, received a 1-on-1 in week 6, and did not receive the First Impression Rose. Our methodology is broadly applicable to all romantic reality television, and our results will inform future $\textit{The Bachelor}$ production and contestant strategies. While our models were relatively successful, we still encountered high misclassification rates. This may be because: (1) Our training dataset had fewer than 400 points or (2) Our models were too simple to parameterize the complex romantic connections contestants forge over the course of a season.