Source author record

Iyiola E. Olatunji

Iyiola E. Olatunji appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computation and Language Cryptography and Security Machine Learning Artificial Intelligence Information Retrieval Software Engineering

Catalog footprint

What is connected

5works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?

Can large language models reliably express a human-like personality, or are they merely mimicking surface cues without a stable underlying profile? To investigate this, we induce personality in LLMs by fine-tuning them on the long-form essays, where each essay is associated with a target Big Five personality profile. We then evaluate the stability and fidelity of the induced personality using the IPIP-NEO questionnaire. Specifically, we ask: (i) does post-training (SFT, DPO, ORPO) stabilize questionnaire scores under prompt rephrasings, and (ii) can it induce target Big Five profiles from unguided essays? Our results demonstrate that fine-tuning consistently reduces variance in questionnaire responses across five models, directly mitigating the evaluation fragility reported in pre-trained models. However, this newfound stability reveals a more fundamental limitation: accuracy on the full five-dimensional profile remains near chance, even when single-trait scores improve. This indicates that unguided essays lack the cues needed for faithful personality expression. We therefore argue for scenario-grounded datasets or interactive elicitation that accumulates test-aligned evidence over time.

preprint2026arXiv

How Secure is Secure Code Generation? Adversarial Prompts Put LLM Defenses to the Test

Recent secure code generation methods, using vulnerability-aware fine-tuning, prefix-tuning, and prompt optimization, claim to prevent LLMs from producing insecure code. However, their robustness under adversarial conditions remains untested, and current evaluations decouple security from functionality, potentially inflating reported gains. We present the first systematic adversarial audit of state-of-the-art secure code generation methods (SVEN, SafeCoder, PromSec). We subject them to realistic prompt perturbations such as paraphrasing, cue inversion, and context manipulation that developers might inadvertently introduce or adversaries deliberately exploit. To enable fair comparison, we evaluate all methods under consistent conditions, jointly assessing security and functionality using multiple analyzers and executable tests. Our findings reveal critical robustness gaps: static analyzers overestimate security by 7 to 21 times, with 37 to 60% of ``secure'' outputs being non-functional. Under adversarial conditions, true secure-and-functional rates collapse to 3 to 17%. Based on these findings, we propose best practices for building and evaluating robust secure code generation methods. Our code is available.

preprint2026arXiv

Predictable Confabulations: Factual Recall by LLMs Scales with Model Size and Topic Frequency

While scaling laws govern aggregate large language model performance, no scaling law has linked factual recall to both model size and training-data composition. We evaluated 38 models on over 8,900 scholarly references evaluated by an automated reference verification system. Recall quality follows a sigmoid in the log-linear combination of model parameter count and topic representation in training data. These two variables alone explain 60% of the variance across 16 dense models from four families, rising to 74-94% within individual families. The form matches a superposition-inspired account in which recall is gated by a signal-to-noise ratio: signal strength scales with concept frequency and the noise floor with model capacity.

preprint2021arXiv

A Review of Anonymization for Healthcare Data

Mining health data can lead to faster medical decisions, improvement in the quality of treatment, disease prevention, reduced cost, and it drives innovative solutions within the healthcare sector. However, health data is highly sensitive and subject to regulations such as the General Data Protection Regulation (GDPR), which aims to ensure patient's privacy. Anonymization or removal of patient identifiable information, though the most conventional way, is the first important step to adhere to the regulations and incorporate privacy concerns. In this paper, we review the existing anonymization techniques and their applicability to various types (relational and graph-based) of health data. Besides, we provide an overview of possible attacks on anonymized data. We illustrate via a reconstruction attack that anonymization though necessary, is not sufficient to address patient privacy and discuss methods for protecting against such attacks. Finally, we discuss tools that can be used to achieve anonymization.

preprint2020arXiv

Context-aware Helpfulness Prediction for Online Product Reviews

Modeling and prediction of review helpfulness has become more predominant due to proliferation of e-commerce websites and online shops. Since the functionality of a product cannot be tested before buying, people often rely on different kinds of user reviews to decide whether or not to buy a product. However, quality reviews might be buried deep in the heap of a large amount of reviews. Therefore, recommending reviews to customers based on the review quality is of the essence. Since there is no direct indication of review quality, most reviews use the information that ''X out of Y'' users found the review helpful for obtaining the review quality. However, this approach undermines helpfulness prediction because not all reviews have statistically abundant votes. In this paper, we propose a neural deep learning model that predicts the helpfulness score of a review. This model is based on convolutional neural network (CNN) and a context-aware encoding mechanism which can directly capture relationships between words irrespective of their distance in a long sequence. We validated our model on human annotated dataset and the result shows that our model significantly outperforms existing models for helpfulness prediction.

Iyiola E. Olatunji

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?

How Secure is Secure Code Generation? Adversarial Prompts Put LLM Defenses to the Test

Predictable Confabulations: Factual Recall by LLMs Scales with Model Size and Topic Frequency

A Review of Anonymization for Healthcare Data

Context-aware Helpfulness Prediction for Online Product Reviews