Researcher profile

Monowar Hasan

Monowar Hasan contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2026arXiv

Pitfalls of Evaluating Language Models with Open Benchmarks

Open Large Language Model (LLM) benchmarks, such as HELM and BIG-Bench, provide standardized and transparent evaluation protocols that support comparative analysis, reproducibility, and systematic progress tracking in Language Model (LM) research. Yet, this openness also creates substantial risks of data leakage during LM testing--deliberate or inadvertent, thereby undermining the fairness and reliability of leaderboard rankings and leaving them vulnerable to manipulation by unscrupulous actors. We illustrate the severity of this issue by intentionally constructing cheating models: smaller variants of BART, T5, and GPT-2, fine-tuned directly on publicly available test-sets. As expected, these models excel on the target benchmarks but fail terribly to generalize to comparable unseen testing sets. We then examine task specific simple paraphrase-based safeguarding strategies to mitigate the impact of data leakage and evaluate their effectiveness and limitations. Our findings underscore three key points: (i) high leaderboard performance on limited open, static benchmarks may not reflect real-world utility; (ii) private or dynamically generated benchmarks should complement open benchmarks to maintain evaluation integrity; and (iii) a reexamination of current benchmarking practices is essential for reliable and trustworthy LM assessment.

preprint2022arXiv

Ellipsis: Towards Efficient System Auditing for Real-Time Systems

System auditing is a powerful tool that provides insight into the nature of suspicious events in computing systems, allowing machine operators to detect and subsequently investigate security incidents. While auditing has proven invaluable to the security of traditional computers, existing audit frameworks are rarely designed with consideration for Real-Time Systems (RTS). The transparency provided by system auditing would be of tremendous benefit in a variety of security-critical RTS domains, (e.g., autonomous vehicles); however, if audit mechanisms are not carefully integrated into RTS, auditing can be rendered ineffectual and violate the real-world temporal requirements of the RTS. In this paper, we demonstrate how to adapt commodity audit frameworks to RTS. Using Linux Audit as a case study, we first demonstrate that the volume of audit events generated by commodity frameworks is unsustainable within the temporal and resource constraints of real-time (RT) applications. To address this, we present Ellipsis, a set of kernel-based reduction techniques that leverage the periodic repetitive nature of RT applications to aggressively reduce the costs of system-level auditing. Ellipsis generates succinct descriptions of RT applications' expected activity while retaining a detailed record of unexpected activities, enabling analysis of suspicious activity while meeting temporal constraints. Our evaluation of Ellipsis, using ArduPilot (an open-source autopilot application suite) demonstrates up to 93% reduction in audit log generation.

preprint2020arXiv

Period Adaptation for Continuous Security Monitoring in Multicore Real-Time Systems

We propose a design-time framework (named HYDRA-C) for integrating security tasks into partitioned real-time systems (RTS) running on multicore platforms. Our goal is to opportunistically execute security monitoring mechanisms in a 'continuous' manner -- i.e., as often as possible, across cores, to ensure that security tasks run with as few interruptions as possible. Our framework will allow designers to integrate security mechanisms without perturbing existing real-time (RT) task properties or execution order. We demonstrate the framework using a proof-of-concept implementation with intrusion detection mechanisms as security tasks. We develop and use both, (a) a custom intrusion detection system (IDS), as well as (b) Tripwire -- an open source data integrity checking tool. These are implemented on a realistic rover platform designed using an ARM multicore chip. We compare the performance of HYDRA-C with a state-of-the-art RT security integration approach for multicore-based RTS and find that our method can, on average, detect intrusions 19.05% faster without impacting the performance of RT tasks.

preprint2020arXiv

Securing Vehicle-to-Everything (V2X) Communication Platforms

Modern vehicular wireless technology enables vehicles to exchange information at any time, from any place, to any network -- forms the vehicle-to-everything (V2X) communication platforms. Despite benefits, V2X applications also face great challenges to security and privacy -- a very valid concern since breaches are not uncommon in automotive communication networks and applications. In this survey, we provide an extensive overview of V2X ecosystem. We also review main security/privacy issues, current standardization activities and existing defense mechanisms proposed within the V2X domain. We then identified semantic gaps of existing security solutions and outline possible open issues.