Source author record

Sabur Butt

Sabur Butt appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computation and Language, Artificial Intelligence, Machine Learning

Catalog footprint

What is connected

3 works

3 topics

4 close collaborators


Preprint, 2026, arXiv

Forecasting Green Skill Demand in the Automotive Industry: Evidence from Online Job Postings

The global transition toward sustainable economies is reshaping labor markets, yet systematic methods for identifying and forecasting green skills remain limited. This study presents a computational framework to measure and predict green skill demand using online job postings from Mexico's automotive industry, which contributes about 4% of national GDP. We compile a dataset of job advertisements from Indeed Mexico, OCC Mundial, and LinkedIn (July 2024 to July 2025), yielding 204,373 skill records. A two-stage pipeline combining multilingual embeddings and ESCO validation identifies 274 unique green skills across 8,576 occurrences (4.22% of all skills). We benchmark 15 time series forecasting models using a rolling origin evaluation. Transformer-based models, especially FEDformer, Reformer, and Informer, achieve the best performance, with MAE around 2.5e-5 and relative RMSE below 15. We further propose a framework to classify skills by absolute and relative growth, identifying stable, emerging, and high-impact competencies. Results show current demand is concentrated in operational sustainability practices, while the fastest-growing skills relate to renewable energy, recycling, and hydrogen technologies. This pipeline supports data-driven workforce planning in the green transition.
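The rolling origin evaluation mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' benchmark: the `naive_forecast` baseline, the one-step horizon, and the `toy_share` values are all assumptions made up for the example, and the 15 models from the study are not reproduced here.

```python
from typing import Callable, List, Sequence


def naive_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Hypothetical baseline: repeat the last observed value."""
    return [history[-1]] * horizon


def rolling_origin_mae(
    series: Sequence[float],
    forecaster: Callable[[Sequence[float], int], List[float]],
    initial_train: int,
    horizon: int = 1,
) -> float:
    """Mean absolute error averaged over all rolling-origin folds.

    The forecast origin starts at `initial_train` and moves forward one
    step at a time; each fold trains on everything before the origin and
    is scored on the next `horizon` observations.
    """
    errors: List[float] = []
    origin = initial_train
    while origin + horizon <= len(series):
        history = series[:origin]
        actual = series[origin:origin + horizon]
        predicted = forecaster(history, horizon)
        errors.extend(abs(a - p) for a, p in zip(actual, predicted))
        origin += 1
    if not errors:
        raise ValueError("series too short for the requested split")
    return sum(errors) / len(errors)


# Invented toy series standing in for a skill's share of postings over time.
toy_share = [0.040, 0.041, 0.043, 0.042, 0.044, 0.045]
mae = rolling_origin_mae(toy_share, naive_forecast, initial_train=3)
```

Because every fold only sees data before its origin, the averaged error reflects how a model would have performed if it had been refit as new postings arrived.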

Preprint, 2022, arXiv

Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2021

Automatic detection of fake news is a highly important task in the contemporary world. This study reports on the second shared task, UrduFake@FIRE2021, on fake news detection in Urdu. The goal of the shared task is to motivate the community to come up with efficient methods for solving this vital problem, particularly for the Urdu language. The task is posed as a binary classification problem: label a given news article as real or fake. The organizers provide a dataset comprising news in five domains: (i) Health, (ii) Sports, (iii) Showbiz, (iv) Technology, and (v) Business, split into training and testing sets. The training set contains 1300 annotated news articles (750 real, 550 fake), while the testing set contains 300 news articles (200 real, 100 fake). In total, 34 teams from 7 countries (China, Egypt, Israel, India, Mexico, Pakistan, and UAE) registered to participate in the UrduFake@FIRE2021 shared task. Of those, 18 teams submitted experimental results and 11 submitted technical reports, substantially more than in the UrduFake shared task in 2020, when only 6 teams submitted technical reports. The technical reports demonstrated different data representation techniques, ranging from count-based BoW features to word vector embeddings, as well as numerous machine learning algorithms, ranging from traditional SVM to various neural network architectures, including Transformers such as BERT and RoBERTa. In this year's competition, the best performing system obtained an F1-macro score of 0.679, which is lower than the previous year's best result of 0.907 F1-macro. Admittedly, while the training sets from the past and current years overlap to a large extent, the testing set provided this year is completely different.
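To make the F1-macro metric concrete, here is a small sketch that computes it from scratch. The class sizes (200 real, 100 fake) match the testing set described above; the predictions are a hypothetical degenerate classifier that labels everything as real, chosen only to show how the macro average penalizes ignoring the minority class. The function names are illustrative, not taken from the shared task's evaluation script.

```python
from typing import Hashable, Sequence


def f1_for_class(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable], label: Hashable
) -> float:
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Class sizes from the testing set: 200 real, 100 fake.
y_true = ["real"] * 200 + ["fake"] * 100
# Hypothetical classifier that predicts "real" for every article.
y_pred = ["real"] * 300

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

In this made-up case accuracy is 200/300, yet F1-macro is only 0.4, because the "fake" class contributes an F1 of 0 and both classes are weighted equally.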

Preprint, 2022, arXiv

UrduFake@FIRE2021: Shared Track on Fake News Identification in Urdu

This study reports on the second shared task, UrduFake@FIRE2021, on fake news detection in the Urdu language. It is a binary classification problem in which the task is to classify a given news article into one of two classes: (i) real news, or (ii) fake news. In this shared task, 34 teams from 7 countries (China, Egypt, Israel, India, Mexico, Pakistan, and UAE) registered to participate, 18 teams submitted experimental results, and 11 teams submitted technical reports. The proposed systems were based on various count-based features and used different classifiers as well as neural network architectures. The stochastic gradient descent (SGD) algorithm outperformed the other classifiers and achieved an F-score of 0.679.
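As a rough illustration of pairing count-based features with an SGD-trained classifier, the sketch below fits a linear model on bag-of-words counts using plain stochastic gradient descent. The abstract does not specify the loss function, tokenization, or hyperparameters of the winning system, so the logistic loss, whitespace tokenizer, learning rate, and the tiny English toy dataset here are all assumptions for demonstration only.

```python
import math
import random
from collections import Counter
from typing import Dict, List, Tuple

BIAS = "__bias__"  # reserved key for the intercept (illustrative choice)


def bag_of_words(text: str) -> Counter:
    """Count-based features: lowercase whitespace tokens and their counts."""
    return Counter(text.lower().split())


def score(weights: Dict[str, float], counts: Counter) -> float:
    """Linear score: bias plus weighted token counts."""
    return weights[BIAS] + sum(weights.get(w, 0.0) * c for w, c in counts.items())


def train_sgd(
    samples: List[Tuple[str, int]], epochs: int = 50, lr: float = 0.1, seed: int = 0
) -> Dict[str, float]:
    """Fit a logistic-loss linear model, updating after every single example."""
    rng = random.Random(seed)
    weights: Dict[str, float] = {BIAS: 0.0}
    data = [(bag_of_words(text), label) for text, label in samples]
    for _ in range(epochs):
        rng.shuffle(data)
        for counts, label in data:
            prob = 1.0 / (1.0 + math.exp(-score(weights, counts)))
            grad = prob - label  # gradient of the log loss w.r.t. the score
            weights[BIAS] -= lr * grad
            for word, count in counts.items():
                weights[word] = weights.get(word, 0.0) - lr * grad * count
    return weights


def predict(weights: Dict[str, float], text: str) -> int:
    """Return 1 (fake) if the linear score is positive, else 0 (real)."""
    return 1 if score(weights, bag_of_words(text)) > 0 else 0


# Invented toy headlines (English placeholders, not from the Urdu dataset).
toy_train = [
    ("officials confirm new hospital opening", 0),
    ("team wins match after late goal", 0),
    ("company reports quarterly earnings", 0),
    ("miracle cure shocks doctors secret", 1),
    ("shocking secret celebrity hoax revealed", 1),
    ("secret trick doubles money overnight", 1),
]
weights = train_sgd(toy_train)
train_predictions = [predict(weights, text) for text, _ in toy_train]
```

On real data one would typically hold out a test set and report F1-macro rather than inspect training predictions, as the shared task does.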
